Sunday, March 12, 2017

A Short Introduction to Makefiles

Makefiles are really good at one thing: managing dependencies between files. In other words, make makes sure that every file that depends on another file is updated when that file changes.

We tell make how to do this by declaring rules. A typical rule looks like this:

A Simple Makefile

# Makefile
# Create bundle.js by concatenating jquery.js lib.js and main.js
bundle.js: jquery.js lib.js main.js
	cat $^ > $@

There are three parts to this rule:

  • The target, bundle.js, before the colon (:).
  • The prerequisites (what the target depends on), jquery.js lib.js main.js, after the colon (:).
  • The command, cat $^ > $@, on the next line, which must start with a tab character (\t), not spaces.

There are two "automatic" variables in this command.

  • $@ - filename representing the target, in this case bundle.js.
  • $^ - filenames representing the list of the prerequisites (with duplicates removed).

"Automatic" means that the variables are automatically populated with relevant filenames. This will make more sense when we get into patterns later.

Here are some more variables that are useful.

  • $< - filename representing the first prerequisite.
  • $? - filenames representing the list of the prerequisites that are newer than the target.
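  • $* - the stem that a pattern rule matched (the part matched by %). An explicit rule like the one above has no stem, so $* is empty there unless the target ends in a suffix that make recognizes.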

The Make Manual contains the full list of automatic variables.
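
To see what each of these expands to, a small sketch can print them from inside a recipe. The file names report.txt and footer.txt are hypothetical, and a pattern rule is used so that $* has a stem to report.

```makefile
# Sketch: print the automatic variables for one pattern rule.
# report.txt and footer.txt are assumed to exist; the names are made up.
# $$ is how a literal dollar sign is written inside a recipe.
%.full: %.txt footer.txt
	@echo '$$@ = $@'
	@echo '$$< = $<'
	@echo '$$^ = $^'
	@echo '$$? = $?'
	@echo '$$* = $*'
	cat $^ > $@
```

Running make report.full for the first time should list both prerequisites for $^ and for $?, since a missing target counts as older than everything. After touching only footer.txt, $? should shrink to just that file.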

Execution

Running make with the above Makefile results in the following execution.

$ make
cat jquery.js lib.js main.js > bundle.js

$ make
make: 'bundle.js' is up to date.

make runs the first target it finds in the file if none is given on the command line. In this case it is the only target.

The second run didn't do anything since bundle.js is up to date. To be up to date means that the last-modified time of bundle.js is newer than the last-modified times of all its prerequisites. Simple but powerful! When we create new targets, all we have to worry about is making sure that each target knows what files it depends on, and what files they depend on, and so on.
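
The "and so on" part is where rules start to chain. As a minimal sketch, assume a hypothetical second step that produces bundle.min.js from bundle.js, with cp standing in for a real minifier.

```makefile
# Sketch: a two-step chain. cp is a stand-in for a real minifier,
# and bundle.min.js is a made-up target name.
bundle.min.js: bundle.js
	cp $< $@

bundle.js: jquery.js lib.js main.js
	cat $^ > $@
```

With this file, touching lib.js and running make should rebuild bundle.js first and then bundle.min.js, while a run with no changes leaves both alone.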

It is possible to list several targets to the left of the colon (:). make treats them as separate rules, and the automatic variables make sure that the correct files are built.

But, since make treats the rules as separate rules, it will only build the first of them, the default target.

# Makefile
bundle.js bundle2.js: jquery.js lib.js main.js
	cat $^ > $@

$ make
make: 'bundle.js' is up to date.

If we want to build bundle2.js, we can do it by explicitly telling make to do it by giving the target as command line parameter.

$ make bundle2.js
cat jquery.js lib.js main.js > bundle2.js

To get make to build both targets at once, we need to add a new target and mark it as .PHONY.

# Makefile
.PHONY: bundles
bundles: bundle.js bundle2.js

bundle.js bundle2.js: jquery.js lib.js main.js
	cat $^ > $@

Running make now results in (after removing bundle*)

$ make
cat jquery.js lib.js main.js > bundle.js
cat jquery.js lib.js main.js > bundle2.js

A .PHONY target is a target without a corresponding file for make to check the last-modified time on. This means the target is always considered out of date, forcing make to check whether any of the target's prerequisites need to be built. The .PHONY label is not strictly necessary here. If it is left out, make checks whether there is a file called bundles, and since there isn't one, it builds the target anyway.

Here's an illustration of what happens when a file with the target's name does exist:

# Makefile
build:
	echo 'Running build'

# Makefile2
.PHONY: build
build:
	echo 'Running build'

$ touch build
$ make
make: 'build' is up to date.
$ make -f Makefile2
echo 'Running build'
Running build

Marking a target that doesn't represent a file as .PHONY is easy to do and avoids annoying problems once your Makefile grows.

.PHONY: clean

Conventionally, every Makefile contains a clean target that removes all the artifacts that are built. In the above case it would contain something like:

# Makefile
.PHONY: clean
clean:
	rm -f bundle*.js

make clean will now clean out all files created by the Makefile.

Directories

Directories in make usually need a bit of special treatment. Let's say we want the bundles above to end up in a build directory. The following Makefile illustrates a problem.

# Makefile
bundles: build/bundle.js build/bundle2.js

build/bundle.js build/bundle2.js: jquery.js lib.js main.js
	cat $^ > $@

Running make illustrates the problem:

$ make
cat jquery.js lib.js main.js > build/bundle.js
/bin/sh: build/bundle.js: No such file or directory
make: *** [build/bundle.js] Error 1

Neither cat nor the shell's output redirection creates the directory automatically. There are three ways to solve this, and one is better than the others.

  1. Add mkdir -p to all rules creating files in the directory.
  2. Add a prerequisite to create the directory on the bundles target.
  3. Add an order-only prerequisite (|) to the rules creating the files in the directory.

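Option 1 is not good because the directory is created more than once, once for each bundle (this is why the -p is needed). Option 2 is not good because the build directory is not really a prerequisite of bundles; it is a prerequisite of the files that are written into it. Option 3 is good because the build directory is clearly a prerequisite of the rule that creates the bundles in this directory.

There are two reasons to use an order-only prerequisite instead of a normal prerequisite. A normal prerequisite would be included in $^, so cat would be handed the directory and fail. A directory's last-modified time also changes whenever a file is added to or removed from it, so the bundles would be rebuilt more often than necessary. Here's the resulting good Makefile.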

# Makefile
.PHONY: bundles clean
bundles: build/bundle.js build/bundle2.js

build:
	mkdir build

build/bundle.js build/bundle2.js: jquery.js lib.js main.js | build
	cat $^ > $@

clean:
	rm -rf build
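
For comparison, option 1 from the list above could look roughly like the sketch below. It relies on $(@D), the automatic variable that holds the directory part of the target. It works, but the mkdir has to be repeated in every recipe that writes into the directory.

```makefile
# Sketch of option 1: each recipe creates its own output directory.
bundles: build/bundle.js build/bundle2.js

build/bundle.js build/bundle2.js: jquery.js lib.js main.js
	mkdir -p $(@D)
	cat $^ > $@
```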

Patterns

Now, we know the basics of Makefiles. We can create rules with targets, prerequisites and commands that are run when needed. But, we have been working with named files all this time. This works fine for small examples like above, but when we have hundreds of files this quickly gets out of hand. Patterns to the rescue.

Let's say that we have a bunch of images that we would like to optimize by running them through an optimizer. The images are in the images/ directory and the optimized images are built into build/images. The naive (and not working) way to do this is shown below. (I'm faking optimize with a simple copy, cp.) The % sign matches the part of the filename that is not given literally.

# Makefile (NOT WORKING)
optimize: build/images/*

build/images/%: images/% | build/images
	cp $< $@

build/images:
	mkdir -p $@

To see why this is not a viable Makefile, we try to run it with make.

$ make
mkdir -p build/images
cp images/a.png build/images/*

$ tree build
build/
└── images
    └── *

What is going on here? Why is only one image copied, and why is it copied under the name build/images/*? The problem is that the target files don't exist yet, so the wildcard matches nothing and the * is taken literally. If we copy the files into the build directory and touch the source files, it works the way we want.

# Copy the image directory into build
$ cp -r images build
# Touch the original images
$ touch images/*
# Build works since build/images/* evaluates to the list of images
$ make
cp images/a.png build/images/a.png
cp images/b.png build/images/b.png

Here is the main rule to know about patterns: the target file list has to be created from the available source files. To do this we have the help of a number of functions, including wildcard and shell. shell lets us call anything that we can call from the shell. This is very powerful!
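
As a rough sketch of the two functions side by side, the following builds the same kind of list in both ways. The images/ layout is the one assumed in this section, and the show target is a made-up helper that only prints the lists.

```makefile
# Sketch: two ways to collect source files.
# wildcard only looks directly inside images/,
# while find also descends into subdirectories.
by_wildcard := $(wildcard images/*.png)
by_shell := $(shell find images -name '*.png')

# Hypothetical helper target that prints both lists.
.PHONY: show
show:
	@echo 'wildcard: $(by_wildcard)'
	@echo 'shell:    $(by_shell)'
```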

How do we solve the above problem? We can do this by getting a list of source images and transforming this list into a list of target images. This is easily done.

# Makefile

# 1. Get the source list of images
images := $(wildcard images/*.png)

# 2. Transform the source list into the target list
target_images := $(images:%=build/%)

# 3. Our default target, optimize, depends on all the target_images
optimize: $(target_images)

# 4. Build the targets from the sources, make sure build/images exist
$(target_images): build/% : % | build/images
	cp $< $@

build/images:
	mkdir -p $@

The first line introduces both variables and functions.

Variables can be declared in a number of ways, but the := declaration is the simplest. It evaluates the expression on the right immediately and assigns the result to the variable on the left, like variables in most programming languages.
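
One of the other ways is plain =, which defers evaluation until the variable is used. A small sketch of the difference, with made-up variable names:

```makefile
# Sketch: := is evaluated once, = is re-evaluated on every use.
name := first
# snapshot captures the value of name right now.
snapshot := $(name)
# deferred looks up name whenever deferred itself is expanded.
deferred = $(name)
name := second

.PHONY: show
show:
	@echo snapshot=$(snapshot) deferred=$(deferred)
```

Running make show should print snapshot=first deferred=second.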

Functions are called with the $() construct, and wildcard is a function that evaluates a shell filename pattern and returns a list of filenames.

The full line above populates images with the .png files from the images directory.

The second line converts the source images into the target images. Variables are evaluated the same way as function calls, with the $() construct. By adding a colon and a pattern=replacement expression, called a substitution reference, after the variable name, we can substitute one pattern for another. Example:

# No quotes: make would treat them as part of the filenames
files := src/a.java src/b.java
# The pattern rewrites the files into target/a.class target/b.class
class_files := $(files:src/%.java=target/%.class)
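
The same rewrite can also be spelled with the patsubst function, which some find easier to read. A minimal sketch, reusing the file names from the example above, with a made-up show target for printing:

```makefile
# Sketch: patsubst takes pattern, replacement, and the text to rewrite.
files := src/a.java src/b.java
class_files := $(patsubst src/%.java,target/%.class,$(files))

.PHONY: show
show:
	@echo $(class_files)
```

Running make show should print target/a.class target/b.class.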

The third line tells make that our optimize target depends on all the target images. This makes sure that all of them are built.

The fourth line sets up the targets, $(target_images), and their prerequisites with a static pattern rule. The pattern does the opposite of the variable substitution reference above: it deconstructs a single target into the source it depends on. The final part of this line is an order-only prerequisite on the rule that creates the directory.

A Recipe for Creating Makefiles

  • Create a list of targets that you want to create from the sources. You have the full power of bash, python, awk, etc. at your disposal.
  • Create a static pattern rule to convert a single target into the source it can be created from.
  • Add order-only prerequisites to make sure directories are automatically created.
  • Add a callable target that depends on all the target files that you want to create.
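
Put together, the recipe could look like the following sketch. The docs/ and out/ directories are hypothetical, docs/ is assumed to contain at least one .txt file, and tr stands in for whatever real conversion is needed.

```makefile
# 1. Create the list of targets from the sources (docs/*.txt is assumed).
sources := $(wildcard docs/*.txt)
targets := $(sources:docs/%=out/%)

# 4. A callable target that depends on all the target files.
.PHONY: all
all: $(targets)

# 2. Static pattern rule: map one target back to its source.
# 3. The order-only prerequisite (| out) creates the directory first.
$(targets): out/%: docs/% | out
	tr a-z A-Z < $< > $@

out:
	mkdir -p $@
```

The all target is listed first so that a plain make builds everything.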

Commented Example

Here's a more exotic example of what you can use make for. We have a directory of JavaScript source files in lib. The corresponding test files are in test. There may be multiple directories below both lib and test. The test files are named like the source files with an added .spec after the stem of the filename.

We want to use a Makefile to help us run only the tests that are relevant based on the files that have changed. To keep track of what tests have been run, we're going to use marker files and touch them every time a test is run.

# Create the list of test files by using the shell function and find
test_files := $(shell find test -name '*.spec.js' -print)

# Convert the test files into marker files with variable substitution
# A marker file looks like this: tmp/test/models/person.spec.marker
marker_files := $(test_files:%.js=tmp/%.marker)

# Do the same thing for the test directories
test_dirs := $(shell find test -type d -print)

# The marker directories have their normal names, no special ending
marker_dirs := $(test_dirs:%=tmp/%)

# test is the default target
.PHONY: test
test: $(marker_files)

# The marker files have an order-only dependency on the marker directories
$(marker_files): | $(marker_dirs)

# Marker files depend on the source files
# Deconstruct a marker file into a source file
$(marker_files): tmp/test/%.spec.marker : lib/%.js

# Marker files depend on the test files
# Deconstruct the marker file into a test file
# When any prerequisite changes, run the tests and then touch the marker file
$(marker_files): tmp/%.marker : %.js
	mocha $<
	@touch $@

# Create the marker dirs
$(marker_dirs):
	@mkdir -p $@

# Clean the project by removing the entire tmp directory
.PHONY: clean
clean:
	rm -rf tmp

Now whenever you run make, it will run only the relevant tests.

# Modify a test file
$ touch test/models/passbook.spec.js
$ make
mocha test/models/passbook.spec.js
  passbook
    ✓ generate pass strips image names
    ✓ doesnt crash with no store number
  2 passing (21ms)

# Modify a source file
$ touch lib/models/passbook.js
$ make
mocha test/models/passbook.spec.js
  passbook
    ✓ generates pass strip image names
    ✓ doesnt crash with no store number
  2 passing (22ms)

Makefiles are really good at one thing: building only stale files. If that is our problem, we should give make a try.
