Anonymous closures in Python
Anonymous Closures In Python
If you’re a Scheme or Javascript programmer, you’re probably familiar with code like this:
var toggle = (function() {
var state = false;
return function() {
= !state;
state return state;
}; })()
In English, this roughly translates to “Run a function which creates
a boolean called state
and
returns a function which returns the value of state
, flipping it each time.”
Each time we run toggle
,
we get the opposite value to the last time. The sequence is guaranteed
to be true
, false
,
true
,
false
, …
regardless of what any other code is doing, since the value of state
is only available to the
toggle
function; nobody else
can change it’s value, and we can see that toggle
will flip it every time.
Hence the guarantee.
This makes use of two different but related constructs: ‘anonymous functions’ and ‘closures’.
Anonymous Functions
For a function to be ‘anonymous’, it must not have a name. This means that it must be a value, for example:
var full_url = 'http://' + domain + '/' + path;
We read this as “Stick together the text 'http://'
with the value we’ve called domain
. Then stick the result on to
the text '/'
.
Then stick this result on to the value we’ve called path
, and call the final result
full_url
.”
We use 3 names here (domain
, path
and full_url
), so where does anonymity
come in to it? The anonymous things are those I’ve referred to as
“results”. At one point we are manipulating the result of ('http://' + domain)
;
what is the name of this temporary, unfinished piece of text? It doesn’t
have a name at all, we just pass it straight on to the next +
. Likewise
for all of the intermediate values, until eventually we give the name
full_url
to the result, which
would otherwise be anonymous too.
The same thing happens in our toggle
example. Here we have
anonymous functions, which are referred to in the English as ‘a function
which…’. We have two anonymous functions here: the first function which
is defined, run, then thrown away in favour of its output. This is
similar to the text above which is created, sent to the next +
, then
thrown away in favour of the output of the +
.
Likewise, inside the first anonymous function we have another function
which is defined, only to be sent back as a return value. In this case
we do eventually give it a name, toggle
, but not in its original
context. The ability to manipulate and pass around functions requires
them to be values, just like bits of text, numbers, booleans, etc.
Closures
A closure is a function which contains free (ie. non-argument)
variables along with values for those free variables (known as the
“environment”, “state” or “context” of the closure). In our example
above, the inner function uses the variable state
but
doesn’t define it or accept it as an argument. As far as this function’s
concerned, state
is “free” to
be anything; the definition of the function is “open”, since its
behaviour is unknown if we don’t know state
.
The outer function does define a variable called
state
, so this is used to
plug the free variable hole in the inner function; the definition of
state
“closes” the inner
function’s definition, or the inner function is “closed over state
” (in the same way that the
Naturals are “closed over addition”), or that the inner function gets
closure from state
, or the
inner function is a “closure”.
Closures are a very powerful tool, and they complement anonymous
functions well. In this example, we create our closure and return it.
This allows the closure to be stored in a variable (toggle
) and called over and over,
even though the function which defines state
only gets called once. This
simple idiom causes the value of state
to persist between calls to
toggle
, so that it correctly
returns alternating values each time.
Since the value of state
is only accessible by the outer and inner functions above, and since the
outer function is anonymous and never stored in a variable, the only
code which has access to state
is toggle
. This makes closures useful
for encapsulation; looking after potentially-fragile data such
that the rest of the world can only access it in safe and relevant ways.
In our case above, it is safe to rely on toggle
to give opposite answers
every time it’s called, since nothing can ‘reset’ the value of state
, or break toggle
by turning state
into, for example, a
number.
In Object Oriented languages there is the concept of “private members”, but these are a crude approximation to free variables with closure. For example, in a class/instance Object Oriented language, objects can have private members and methods can access those private members. However, methods cannot have private members. Likewise methods generally aren’t given methods. Everything in Object Oriented languages happens at the value level, which severely restricts the ability to abstract (since we don’t have anonymous function values).
Note that it may seem rather arbitrary that we take the value of
state
from the enclosing
function’s scope. Indeed it is arbitrary, and there have been many
alternative implementations. For example, McCarthy’s original LISP takes
free variables from the scope that the function is called in;
this is known as dynamic scope, which still lives on in Emacs LISP. The
convention of taking free variables from the scope where the function is
defined, called lexical scope, came later and went mainstream
with ALGOL 60. The virtues of lexical scope compared to dynamic scope
became clear when it was incorporated into LISP, which resulted in
Scheme. Scheme is a very lightweight language compared to most other
LISPs (Common LISP, MACLISP, etc.) because it applied lexical scope and
(closely-related) continuations uniformly to approach most problems;
thus there are fewer edge-cases, fewer control structures, and much less
of a combinatorial explosion of incompatibilites.
The question is: how can we do this in Python?
Closures in Python
Python has anonymous functions, which look like the following:
lambda x: x + 5
lambda x:
means
the same as function (x) {
in Javascript, and there’s no need to write return
, since
in Python an anonymous function can only ever contain a single
expression, which is assumed to be the return value. This restriction to
one expression, and thus no statements, is a Good ThingTM
since it can make reasoning about code much easier; if our functions can
only do one thing, then we’re encouraged to make small, simple,
easy-to-understand functions, then compose them together. Also, since
we’re only allowed an expression, and not a statement, it is implied
that anonymous functions will be pure and side-effect-free.
This restriction, however, also poses a problem when we try to use
state. For example, we can’t write the Javascript line var state = false;
,
since the Python equivalent state = False
is a statement and not an expression.
We can get around this quite easily by turning local definitions into arguments. For example, we can rewrite our Javascript as:
var toggle = (function(state) {
return function() {
= !state;
state return state;
}false); })(
Here we’ve moved the declaration of state
into the argument list of the
first function, so our outer function doesn’t define anything, it’s
become parameterised by a state. The assignment has moved into the
function call, so state
will
initially be false
due
to Beta reduction. This method of capturing variables translates quite
readily to Python:
= (lambda state: (lambda *_: not state))(False) toggle
However, this isn’t quite the same. Firstly, this version of the
toggle
function can take an
argument. This is actually OK, since the argument is a ‘vararg’
(indicated by the asterisk) which makes it optional. I will use the
standard convention of calling things _
if they are unused.
The second difference with our Python version is actually very
important: the value of state
never gets flipped, so toggle()
always gives the result True
! This is
because the Javascript line state = !state;
is also a statement, not an expression, so we can’t include it in our
Python function.
Since Python is not a pure (side-effect-free) language there are expressions which cause side-effects, which we can use instead of statements. The problem is, they usually don’t give us the values we want to return from our lambdas, so we need to work around the single-expression limitation and include multiple: those which perform the side-effects we want, and one for the return value. I claim that this limitation is actually liberating, since we’re forced to implement our own evaluation strategy, and thus we can tailor it to the problem at hand.
A simple solution is to wrap many expressions inside a container (in
this case I’ve chosen a tuple, which Python guarantees to evaluate
in-order) then pick out the one we care about. Constructing and
destructing a container constitutes an expression, so we can use it in
our lambdas. For example, if we want to write the string "hello"
to the console, and return the string "world"
,
we can do the following:
lambda *_: (sys.stdout.write("hello"), "world")[1]
This function ignores any arguments (it’s a thunk), then it creates a
tuple. The first value in the tuple is None
, the
result of sys.stdout.write("hello")
,
which has the side-effect of sending "hello"
to the output buffer. The second value is the string "world"
.
We then pull out the second value of our tuple as our result, ie. we
return "world"
.
Now we need to find an expression which is equivalent to the line
“state = !state;
”
in our Javascript. Unfortunately we can’t do this directly, since in the
Javascript we’re dealing with variable names, and simply changing which
values they point to, whereas in our Python expressions we’re dealing
with the actual Boolean values True
and False
themselves. Understandably, Python will not let us change True
into False
or vice
versa, so we need to create a pointer-based abstraction, similar to
using variable names like the Javascript. A simple way to do this is to
keep our values in lists, and replace the contents of the lists when we
want to ‘update’ a value. Voila:
= (lambda state:
toggle lambda *_: (state.append(not state.pop()), state[-1])[1])
(False]) )([
Here our state
value is given
as [False]
, which
is a list containing the value False
. The
side-effecting expression we use is state.append(not state.pop())
which first “pops” the last value out of state
, so that state
becomes an empty list []
and we receive the value False
. We then
use the not
operator on
this value (equivalent to Javascript’s !
) to
receive a value of True
. We then
use the state.append
method to
push this value on to the end of state
, so that state
has become a list containing the
value True
, or [True]
. Next
time it will start with [True]
and
result in [False]
. Like
before, we extract the second element of our tuple to produce our return
value, correctly implementing the toggling behaviour.
Stacks
The above may seem a pretty pointless exercise, and I would agree that production code should probably be written in a more straightforward way, but now that we’ve thrown away Python’s built-in statement-sequencing and namespace implementations, we’ve actually exposed some pretty interesting problems, which we can apply some nifty Computer Science to.
The observant of you will have spotted that popping and appending lists like I’ve just done is actually stack manipulation. Since I’ve just exposed a stack-oriented subset of Python, it makes sense to look at some stack languages.
The exemplar of a language oriented around stacks is Forth; however, in Forth everything is a procedure (statement), which goes against the nice expression-only form we were inching towards above. In Forth, the stacks are global, mutable and, most importantly, external entities. Forth procedures can be used to manipulate stacks, but only in the same way that C procedures can be used to manipulate memory. Neither actually has a “stack”(/“memory”) datatype; the stack(/memory) lives outside the program, and the language provides means to access it. In fact, the only reason Forth is ‘stack-oriented’ rather than ‘memory-oriented’ is that its procedures are generally written to make heavy use of the stacks; but they don’t have to, they could just as well write to memory instead.
There is actually another nice stack language out there which is expression-based; in fact it’s purely functional! It’s called Joy and rather than running procedures which may happen to modify an external stack, in Joy everything is a function which accepts a stack and returns a stack.
For example, consider dup
which pops a value
off the stack and pushes it back on twice. Although its before/after
behaviour in Forth and Joy is identical (the operational semantics), the
interpretations of what’s happening (the denotational semantics) are
very different.
In Forth, there is a stack containing a value x
. dup
is a procedure
which takes no arguments and returns no values (just like everything in
Forth). As a side-effect of running dup
,
Forth looks at its stack and pops x
off the
top. Forth then pushes x
back on to the
stack, then pushes another x
on top as
well.
In Joy, dup
is a function. To call dup
, we must give it a stack, and it will give us
back a stack, but since in Joy everything does this, we don’t need to
bother writing it; it’s implied. The stack which dup
returns is the input stack with a duplicate of
the top value.
In Python we can write this as:
= lambda xs: xs + [xs[-1]] dup
There are a lots of other stack functions we can make; in fact, since stack languages are Turing-complete, we can make any realisable function as one between stacks. However, there’s no point implementing more than we need. As I wrote about before, [Brent Kerby] 1 conjectures (by analogy to Schoenfinkel’s applicative combinatory logic) that a Universal set of stack functions is as concise as:
# Pops off two values, calling the top one on the remaining stack.
# Similar to Schoenfinkel's K combinator 'lambda x: lambda y: x'
= lambda xs: xs[-1](xs[:-2])
k
# Pops off two values and pushes two new functions. The first function
# pushes the second value then calls the first value. The second
# function calls the first value then pushes the second.
= lambda xs: xs[:-2] + \
cake lambda ys: xs[-1](ys + [xs[-2]])] + \
[lambda ys: xs[-1](ys) + [xs[-2]] ]
[
# Optional; function composition (saves writing lambda all the time!)
= lambda f: lambda g: lambda x: f(g(x)) o
With these functions defined we can create any mapping we like from lists to lists, which means we can express any kind of mutable state we like by putting values in lists and updating them with closures.
Arguments
In a similar vein to stacks, there is another nugget of Computer Science lurking around here known as de Bruijn indexing.
Let’s say we have a Javascript function like this:
function(a) {
(return (function(b) {
return function(c) {
= a + b;
a = b * c;
b return a + b + c;
;
}2);
})(1); })(
We can write this in Python using the techniques above, to get the following:
# Direct use of Python
= (lambda a:
direct lambda b:
(lambda c: (
(+ b[-1]),
a.append(a.pop() * c),
b.append(b.pop() -1] + b[-1] + c
a[-1])
)[2])
)([1])
)([
# Using stack manipulation we would need to discard the function above and start
# from scratch. My first attempt gives a function something like this, where q()
# is quotation and the rest of the functions are Kerby's concatenative
# combinators (note, this code probably won't run ;) ):
= lambda xs: xs[-1](xs[:-1])
i = lambda xs: xs[-1](xs[:-2]) + [xs[-2]]
dip = lambda xs: xs[:-2] + [xs[-1], xs[-2]]
swap = lambda x: [x]
q
q(
q(q(q(dup), dip, swap), dip, swap),
dip,
dup,
q(swap, dup, q(i), dip),
dip,
swap
), dip, swap, q(
dup,
q(q(q(q(dup), dip, swap), dip, swap), dip, swap, i),
dip
),
dip,
q(q(q(dup),dip, swap), dip),
dip,
dup,
q(i, q(dup), dip),
dip, i
This takes a stack:
[[lambda xs: cons(xs[-1] + xs[-2])(xs[:-2])], # Addition lambda xs: cons(xs[-1] * xs[-2])(xs[:-2])]), # Multiplication a, b, c] # Equivalent to ‘a’, ‘b’ and ‘c’ above
and gives a stack:
[addition, multiplication, (a+b), (bc), (a+b+bc+c)]
Stack manipulation can be very powerful, but here we end up doing a lot of shuffling around, since we haven't given ourselves a namespace to define new functions, etc.
There is an alternative, which doesn't require re-working our function-based algorithms, but which closely resembles a stack language. The trick is to look at the function definition and spot that `a`{.python}, `b`{.python} and `c`{.python} are meaningless placeholders. By using letters of the (latin) alphabet, we're effectively indexing our arguments, in the same way that we'd index a list. But we don't use letters to index lists (we'd lose all ability to perform arithmetic, etc.) so why do it for arguments? Let's index our arguments with Natural numbers instead!
A rather direct interpretation might go as follows:
```python
(lambda _0:
(lambda _1:
(lambda _2: (
_0.append(_0.pop() + _1[-1]),
_1.append(_1.pop() * _2),
_0[-1] + _1[-1] + _2
)[-1])
)([2])
)([1])
However, I shall proclaim that this breaks code locality, since we get an implicit dependency on global information (the current argument number). We can aleviate this by reversing the numbers:
lambda _2:
(lambda _1:
(lambda _0: (
(+ _1[-1]),
_2.append(_2.pop() * _0),
_1.append(_1.pop() -1] + _1[-1] + _0
_2[-1])
)[2])
)([1]) )([
This way, the definitions of the inner functions don’t depend on knowledge of their context, at the expense of outer functions requiring knowledge of their internals (which seems fair enough to me). There is effectively an API that the outer functions must provide and the inner functions must consume.
This is the right idea, but these variable names are still opaque to the computer. They only mean something to the programmer (unless we use ‘eval’, at which point all bets are off). Every time we hide our intentions from the computer, whether intentionally through obfuscation or accidentally through bad design, we’re just making life harder for ourselves down the line, since the computer won’t be able to help us.
Here’s an alternative which uses a list to hold all arguments, but rather than push, pop and apply functions to the list like we did for the stack implementation, we can index it using actual, information- rich numbers (how many arguments back to go). This is better than using contrived names (eg. if we’d used a dictionary), since the computer understands the names too:
# "pushing" a value in an environment sticks the value on the top of
# the stack
= lambda env: (lambda x: env.append(x))
push
# "popping" an environment removes a value from the top of the stack
= lambda env: (lambda _: env.pop())
pop
# "peeking" into an environment returns a value from the stack without
# removing it.
# This requires an extra layer of indirection to get around Python's
# call-by-value semantics; instead of returning the value, we return a
# function which will apply the value to its argument. This assumes
# that every value is a function, but that's nothing to worry about
# if we're consistent
= lambda env: (lambda n: (lambda x: env[-1 * (n+1)](x)))
peek
# "prepare" is a simple wrapper which turns a regular function into
# one which pushes and pops arguments
= lambda (push, pop):
prepare lambda f:
(lambda x:
(None), pop(None))[1]))
(push(x), f(
# "running" a function creates a new environment and passes
# specialised instances of "peek" and "prepare" for that environment
# to the function
= (lambda env:
run lambda f:
f((prepare((push(env), pop(env))), peek(env))) )([])
With these helper functions in hand, we can now write anonymous functions which push and pop rather than using named arguments, like the following (note that they must take one argument at a time, but that’s fine since we can Curry any n-ary function into nested unary functions):
= lambda a: lambda b: b(a)
foo1
= run(lambda (f, v):
foo2 lambda _: f(lambda _: v(0)(v(1))))
f( )
Here foo1
is equivalent to
foo2
, but there’s an awful lot of
boilerplate. We can alleviate this if we create a simple Domain-Specific
Language (DSL) to abstract it away.
Unfortunately we don’t have built-in macros (hygenic or otherwise) or generalised Monads in Python, so I’ll make do with a direct representation of syntax trees as lists. Tuples would be nicer, but Python requires awkward postfix commas for empty and singleton tuples, which will just serve to confuse us.
Our syntax requires two main forms of expression: function definition
and function application. We’ll represent a function definition as a new
list, so [[None]]
is
like f(lambda _: f(lambda _: None))
.
We’ll represent function application by ‘consing’ values (ie. putting
them one-after-another in a list), so [a, b, c]
is equivalent to a(b)(c)
.
We can also put arbitrary Python expressions where we like, with three caveats:
- Python’s call-by-value semantics will evaluate anything outside a (Python) function definition immediately, which we can work around using lambdas.
- Python expressions are opaque to our compiler; if we want to re-enter our language, we have to explicitly call our compiler function again.
- Raw lists are swallowed by the interpreter, so they need thunking.
# A thunk-maker. Actually, this is Schoenfinkel's K combinator
= lambda x: lambda _: x
k
# A poor man's approximation to LISP's cond. Takes pairs of functions;
# the head is the test and the tail is the result if the test passes
= lambda expr, conds: filter(
cond lambda (test, _): test(expr),
+ ((k(True), k(None)))
conds 0][1](expr)
)[
# A nicer interface to Python's equality methods
= lambda x: lambda y: x == y
eq
# A nicer interface to Python's list constructor
= lambda x: lambda y: [x] + y
cons
= lambda e: run(lambda (f, v): # Our AST interpreter.
i type(e), ( # First match on the type.
cond(type([])), # A list is an AST node.
(eq(lambda _: cond(len(e), ( # Match on the length.
0), # Empty nodes are void thunks
(eq(lambda _: # ie. 'lambda _: None'.
None)), # Args are ignored, so no f.
k(1), # Singletons are functions.
(eq(lambda _: # Recurse to interpret the
0])))), # value before thunking it.
k(f(i(e[2), # Pairs are applications.
(eq(lambda _: # Recurse to interpret the
0]))( # function and argument
f(i(e[1]))), # before applying them.
i(e[True), # Length of e > 2, we have
(k(lambda _: # chained applications.
# Split off the first pair,
i(cons( 2])), # interpret them, cons them
f(i(e[:2:]))))), # back and recurse.
e[type(0)), # Natural numbers are de
(eq(lambda _: lambda _: v(e) # Bruijin indices
True), # e is not a list, so just
(k(# thunk it and we're done k(expr))))
Now we can make pure functions, much more conciseley than with Python’s regular syntax:
# Using names. A bit clunky.
= lambda _2: (
foo1 lambda _1: (
lambda _0: _2(_1, _0)
)
)
# Using de Bruijn indices. A bit clunkier, but in a very
# regular, automatable way. f and v are the function wrapper
# and value peeker, curried with a fresh environment
lambda (f, v): f(
lambda _: f(
lambda _: f(
lambda _: v(2)(v(1)(v(0)))
)
)
)
# Using de Bruijn indices and the DSL is much more concise
2, 1, 0]]]) i([[[
We can’t write closures using this scheme though, since the stack gets popped when we leave a function. In other words, this won’t work:
lambda (f, v): f(
lambda _: f(
(lambda _: f(
(lambda _: (
2).append(v(2).pop() + v(1)[-1]),
v(1).append(v(1).pop() * v(0)),
v(2)[-1] + v(1)[-1] + v(0)
v(-1]
)[2])
))([1])
))([ )
To get around this we need to use a spaghetti stack, but that requires us to stop using Python’s built-in lists. I may do that in another post but I won’t do it here, since I’ve added so much stuff already that its publication keeps getting delayed ;)
Types
Now this is all ridiculous, so what’s the point? Well, the trick is to see what happens when the assumptions are broken. We’re using functions of one variable (which is good, since it means they’re curried) and using Natural numbers to peek into a stack, known as de Bruijn indexing. But silly de Bruijn surely missed that there’s nothing stopping us from peeking too far back in a stack and running out of arguments! Oh the huge manitee!
In fact, this isn’t a bad thing at all! We’ve already come across variables which don’t exist in our argument list; they’re free variables. When all of our variables are de Bruijn indices, they’re implicitly telling us their ‘freeness’; in other words, how many wrappers they need in order to resolve.
What if we don’t supply enough wrappers, what should we do then? Well, since we can’t evaluate the variable, we should leave it unevaluated. There’s always a wraper around us, in the form of the language runtime and operating system, so this can interpret it for us. One way they can do such a thing is to treat unevaluated functions as “type constructor”; functions which cannot be evaluated away.
The simplest type constructor is for the unit value, of unit type.
This is sometimes written as ()
but in Python it is known as None
(and is of
NoneType
type). This constructor
function is ‘nullary’, it takes no arguments. If we see a unit
constructor anywhere, it is always representing a value, albeit one
that’s devoid of information.
If a nullary constructor is ever applied to an argument, that’s a type error. However, we don’t have to care in our case since we never try to evaluate the constructors.
Here’s an example of code which accepts anything and spits out a unit value; assuming we’re interpreting the first free variable to be a unit type constructor:
1) i(
Another simple type constructor is for the Boolean type, with
constructors True
and False
. Again
these are nullary; their mere presence denotes a value, although this
time there is one bit of information contained in the value. With this
said, boolean values are usually nowhere near as useful as boolean
functions, which we can define as:
# true
1]])
i([[# false
0]]) i([[
These take two arguments and spit out the first or the second; their
presence acts like an if
-then
-else
expression.
There are other constructors which take arguments, for example
tail-recursive lists. They can either be nullary nil
or
binary cons
:
# nil
1)
i(
# cons
2, 0, 1]]) i([[
I’ll wrap this up here, but will revisit some of these topics in later posts. It’s interesting to see how inter-related many topics are in Computer Science. Even in a language which takes many implementation choices away from us, like Python, a little hackery can bring them back and allow us to compose our programs in new and interesting ways.
Also, it’s always fun to mis-use any language and see what we can get away with ;)