First-class Functions in Make

Make allows users to define and call functions, but its built-in facility for invoking user-defined functions requires that the function definition be bound to a variable name. One can easily pass and return function names that can be called elsewhere, one cannot define an anonymous function within an expression or manipulate functions (not just their names) as first-class data values.

Experimentation reveals that there are some "interesting" characteristics of GNU Make that allow us to effectively achieve anonymous functions, and more importantly, treat them as first-class data values. This approach even allows first-order functions to be written (functions that take functions as parameters and/or return then as values).

Hack #1 : Double Expansion

The first interesting characteristic we can exploit is the ability to double-expand a variable value. Ordinarily, variables are expanded exactly once. Expansion replaces $(...) substrings within a variable with the results of function calls or expansions of other variables. With := assignments the right-hand-side of the assignment is expanded immediately when the assignment is processed. With = assignments, the RHS is expanded lazily, or when the assigned variable is later referenced. In either case, it is expanded once.

But it just so happens that there is a context in which a value will be expanded an extra time: within the arguments of a $(call ...) function that is performed on a built-in function that takes a variable number of arguments. This does not happen when call is used to invoke a user-defined function (the normal case). It does not happen with built-in functions that take a fixed number of arguments (the vast majority). This only applies to or, and, and if. And it does not happen when you directly call these built-in functions ... only when you call them indirectly via call.

x = -X-

$$x
"$x"
$(or $$x)
"$x"
$(call word,1,$$x)
"$x"
$(call or,$$x)
"-X-"
$(call and,$$x)
"-X-"
$(call if,1,$$x)
"-X-"

Interestingly, the GNU Make documentation does obliquely mention that expansion of variables when calling built-in functions may result in "unexpected results". Note the use of the term "unexpected", rather than "undefined". I call that a feature. Well, okay, perhaps it is arguable whether this is a feature or a bug of GNU Make ... but it is unarguable that it is the behavior of GNU Make.

There are some extremely useful applications of this. Consider:

$$% : ; @# $(info $* --> "$(call or,$$$*)")

This allows you to type arbitrary function calls or variable expansions on the command line to see how they evaluate within the makefile. Try " make '$(abspath .)' " for example.

Hack #2: Passing arguments

Given the ability to expand strings on demand, we only need the ability to bind variable names to values on demand, and then we will essentially have lambda expressions.

foreach expressions provide a way of binding arbitrary variable names. The bindings are limited to the duration of the foreach call, but they are visible within nested function calls, which is just what we want. Here is an example using a named function:

doubleA = $a$a

$(foreach a,1 2,$(call doubleA))
"11 22"

Let's take as an example the function map. The built-in foreach function is a form of map that operates on space-delimited words. To make the example non-trivial, our map will operate on colon-delimited words. The straightforward approach would be to employ a named function:

map = $(subst $() ,:,$(foreach a,$(subst :, ,$2),$(call $1,$a)))
mapQuote1 = [$1]
mapQuote = $(call map,mapQuote1,1:2:3)

$(call mapQuote,1:2:3)
"[1]:[2]:[3]"

This map function lacks some flexibility that foreach has. For one thing, we need to define a helper function to get anything done. Even worse, we cannot pass parameters to mapQuote. We could modify map to accept another parameter, but then you might want to pass two parameters. How many versions of map might be needed?

Using the expansion trick, we can create a map function that accepts an anonymous function, avoiding these problems:

map2 = $(subst $() ,:,$(foreach a,$(subst :, ,$2),$(call or,$1)))

mapQuote2 = $(call map2,$1$$a$2,$3)

$(call mapQuote2,[,],1:2:3)
"[1]:[2]:[3]"
$(call mapQuote2,<,>,1:2:3)
"<1>:<2>:<3>"

Note that the anonymous function body -- $1$$a$2 -- must prefix its arguments with $$, because it will be evaluated once before being passed to fmap. The single $ expansions -- $1 and $2 -- refer to variables in the context in which the anonymous function appears, and $$ to refer to the anonymous function's own arguments.

This map example would have used a foreach function anyway, but we could insert a foreach just for the purpose of binding a variable name. This can get verbose, especially when more than one variable is assigned, but the biggest problem with foreach is that it comes with the limitation that the values cannot contain spaces or the empty string. Which brings us to...

Hack #3 : Positional Parameters

A more general solution is to use the positional names assigned by $(call ...). When a user-defined function is called, the variables $1, $2, etc., are assigned to its parameters. But if you try $(call or,{func},{arg1},...) you will find that these variables are not available to {func}. This is because the automatic variables are not created for built-in functions (who would have noticed without the double expansion hack?).

As a result, we must call an ordinary named function to bind the automatic variables, which will then call our anonymous function. Normally the automatic variables for one function call will not be visible to a nested call, but again built-in functions are apparently a special case.

Note that the helper function itself is argument 1, so our function's arguments begin at index 2: $$2, $$3, etc.. A bit odd, but if you have read this far you must have a high capacity for dealing with the bizarre.

apply = $(call or,$1)

$(call apply,[$$2],x)
"[x]"

fmap = $(subst $() ,:,$(foreach a,$(subst :, ,$2),$(call apply,$1,$a)))

$(call fmap,[$$2],1:2:3)
"[1]:[2]:[3]"

Examples

So now we should be able to do something interesting. Here are two implementations of fold. These use anonymous function $1 to combine (or accumulate) all of the values in $3, starting with the value $2. The foldl version starts at the first element in the list, and the foldr version starts at the last element.

cdr = $(wordlist 2,9999,$1)# 9999 far exceeds the limit of recursion

foldl = $(if $3,$(call foldl,$1,$(call apply,$1,$2,$(word 1,$3)),$(call cdr,$3)),$2)
foldr = $(if $3,$(call apply,$1,$(word 1,$3),$(call foldr,$1,$2,$(call cdr,$3))),$2)

$(call foldr,$$2$$3$$2,.,a b c)
"abc.cba"
$(call foldl,$$2$$3$$2,.,a b c)
".a.b.a.c.a.b.a."
$(call foldl,$$(patsubst $$3,%,$$2),abcdef,ab% c% %ef)
"d"
$(call foldr,$$3 $$2,,1 2 3)
" 3 2 1"

The following while function is a purely functional equivalent of a procedural loop. It will recursively apply function $2 to $3 until $1 evaluates to false (the empty string).

while = $(if $(call apply,$1,$3),$(call while,$1,$2,$(call apply,$2,$3)),$3)

$(call while,$$(findstring ||,$$2),$$(subst ||,|,$$2),this|is|||a|||||test)
"this|is|a|test"

Even more interesting is using combinators to construct functions from other functions. A good example of the power of higher order programming is in constructing a function that matches a string. Here is a very simple pattern language that supports only alternatives. For example, the string "a|b%" will match "a" or any string beginning with "b". The function matchFn will generate an anonymous function that tests a string to see if it matches the pattern in $1.

mlit = $(if $(filter $1,$2),$2)
matchAlt = $(if $(word 2,$1),$$(or $(call matchFn,$(word 1,$1)),$(call matchFn,$(wordlist 2,999,$1))))
matchFn = $(or $(call matchAlt,$(subst |, ,$1)),$$(call mlit,$1,$$2))

$(call matchFn,a|b|c)
"$(or $(call mlit,a,$2),$(or $(call mlit,b,$2),$(call mlit,c,$2)))"

match = $(call apply,$(call matchFn,$1),$2)

$(call match,abc|def|ghi,def)
"def"
$(call match,abc|def|ghi,d)
""

No matter how complex it is, the resulting anonymous function accepts a single argument, so we can use it without passing around additional arguments to functions like fmap:

$(call fmap,$(call matchFn,f%|%x),foo:bar:baz:qux)
"foo:::qux"

Footnote

I hope that this set of examples has illustrated that the functional language embedded within GNU Make possesses computational capabilities, expressive power, and a capacity for illegibility that you had not imagined, and has at the same time demonstrated that the use of this power should be considered ill-advised, mischievous, masochistic, or perhaps downright malicious.

If you find writing GNU Make "code" as interesting as I do, you may also enjoy programming INTERCAL, UnLambda, Befunge, or Piet. Google at your own risk.

In the meantime, you can browse the sources for this example.