A Whole New World
I’ve found a couple of programming language projects which introduce the concept of “Worlds” as first-class citizens, although in slightly different ways.
In the land of Object Oriented programming, ie. Smalltalk and its derivatives, the concept of worlds is used to encapsulate the state of a program. All computation happens in a world, getting the current world is a simple function call, new worlds can be spawned from other worlds (OO-style inheritance applies to the worlds) and any world can be “committed to”, ie. moving the rest of the computation into that world where it carries on. Essentially it gives programs an internal “undo” system, which can be used to try many possible execution paths safely. Some use cases are the following:
*goal*, and there are X ways we could go about doing it.
We want to do all running concurrently.
We split our computation into X new worlds, *goal*.
In each world we attempt a different way of fulfilling *goal* is achieved, all worlds except the current are deleted.
Once in whichever world managed to satisfy *goal* first. The program continues to run
This way we guarantee that we reach goal using the fastest method (since we use every method), but this is achieved in an inherently distributed way: running multiple, concurrent attempts on one CPU in one machine may often give slower results than arbitrarily choosing one attempt and waiting for it to finish, even if it’s not the fastest way possible. However, the nice thing about this use of worlds is that concurrent processing power can be utilised in an easy way without conflicts, and the thought required to understand it is very minimal; there’s no scheduling, preempting, breadth-first/depth-first tradeoffs or anything like that. Just try everything and the fastest one will win by virtue of being the fastest, then move on to the next part of the program.
Another simple way of using worlds is to deal with potential errors (ie. exceptions). This is useful since exception handling is notoriously annoying to get right. In many programming languages and paradigms, when something fails (like attempting to contact a remote server for example) then the computation stops and an “exception” gets “thrown”/“raised”, which means that whatever asked for that computation to happen is given an “exception” rather than whatever it requested. Receiving an exception is essentially a “GOTO” which jumps to whichever bit of code is assigned to deal with that type of exception (for example there might be SyntaxError, IOError, DivisionByZeroError, etc.). The problem with exceptions is that the catch-all nature of their handlers makes it difficult to determine where the exception came from. For example, we might write (in Python):
import urllib2
import time
= urllib2.urlopen('http://www.identi.ca/warbo')
page1 60)
time.sleep(= urllib2.urlopen('http://www.identi.ca/warbo')
page2 = post_difference(page1, page2)
rate print str(rate)+" posts per minute"
This downloads my Identi.ca timeline and gives it the name
page1
, then waits for a minute and does it again for
page2
. Some arbitrary function called
post_difference
is run, presumably to count the difference
in Identi.ca posts between page1
and page2
,
and the result is called rate
. We then write out a string
of rate
along with units. A problem with this is that the
download may fail, and we haven’t given any code to handle this, which
means the program will exit with an error if this happens. To prevent
this we can do:
import urllib2
import time
try:
= urllib2.urlopen('http://www.identi.ca/warbo')
page1 60)
time.sleep(= urllib2.urlopen('http://www.identi.ca/warbo')
page2 = post_difference(page1, page2)
rate print str(rate)+" posts per minute"
except:
del page1
del page2
del rate
Now if we get any type of exception in the three lines following
try:
then execution will jump to the except:
block, otherwise the except:
block will just be skipped.
Since having one page is useless without the other, and keeping such
useless objects accessible prevents the garbage collector from freeing
up the memory they take up, we use this exception block to simply delete
the page1
, page2
and rate
objects. Unfortunately there’s actually a problem with this code too!
Since we delete page1
, page2
and
rate
, and the exception might happen before those objects
are defined (for example if an exception occurs whilst trying to
download page1
, then the download of page2
and
calculation of rate
will never occur), then we might be
asking to delete things which don’t exist, which causes an exception!
The example I’ve chosen here is meant to be slightly pathological, since
the two dangerous operations we’re doing (downloading my Identi.ca
timeline) are exactly the same apart from binding name and line number.
This means we can’t, for example, have one except:
block to
deal with network errors and another to deal with some other exception
type, since they both have the same exception possibilities. There are
various ways of doing this correctly (using multiple
try:
/except:
blocks, using temporary booleans
to store whether each bit succeeded or not, etc.), but none of them are
as intuitive as the way shown above. Using worlds, however, makes things
easy, since we can encapsulate all of the side-effects of our attempts
in a child world and simply delete the world if we fail, ie. something
like the following (although I’ve made up the syntax since it doesn’t
exist for Python ;):
import urllib2
import time
= get_world().spawn()
child_world
use_world(child_world)try:
= urllib2.urlopen('http://www.identi.ca/warbo')
page1 60)
time.sleep(= urllib2.urlopen('http://www.identi.ca/warbo')
page2 = post_difference(page1, page2)
rate print str(rate)+" posts per minute"
child_world.commit()except:
del child_world
Here we know that every side-effect (including the definition of new
objects) will be contained in child_world
, so if we fail we
can simply delete it. Of course, there’s nothing to stop a language
using worlds directly for its exception handling (say by destroying
world after world until one is reached that handles the exception,
rather than just destroying stack frames), but Python couldn’t do this
as it would break its behaviour and thus existing code.
Another suggestion for using worlds is for namespaces and module systems. In Python, code from other files is made accessible to a program by using a line like:
import xyz
Which makes the contents of the module xyz
available to
the program, so that you can run, say:
import xyz
xyz.some_function(a, b, xyz.something)
The xyz
module has its own namespace, so that
everything it contains is only accessible if you put xyz.
before the name (and namespaces can be nested, so we could say
xyz.abc.something.foo()
). It’s similar to a Web site
address but using full stops instead of slashes. This makes sure that
whatever xyz
contains, it won’t mess up your program, since
you can call something a
and xyz
can call
something a
, but they won’t conflict in your program since
your a
is called a
and xyz
’s
a
is called xyz.a
. There is also namespace
injection, which takes things from a module and puts them in the
current namespace, for example:
from xyz import a, b, some_function
Now those three things we’ve imported will be available without the
xyz.
in front of them. However, this is acceptable since
we’ve explicitly defined which bits of xyz
we want, so it
should be obvious to us if this will cause any conflicts without having
to know what’s in xyz
. However, in Python there is a
dangerous and frowned-upon wildcard for namespace injection, which looks
like:
from xyz import *
This will make everything from xyz
available in your
program without having to put xyz.
before it. Since this
line doesn’t specify any names, the only way we could know what this is
going to do would be to manually read the code in xyz
or
do:
import xyz
dir(xyz)
But this would still only work for that particular version of
xyz
you’re using. Other people might be using different
versions, and future updates might change which names it uses
internally. This is often called* polluting the namespace*, since you
end up with crap that gets in the way.
By defining modules in their own worlds, we essentially get the advantage of namespaces (ie. worlds can be used as a namespace implementation), plus we can make and destroy namespaces at any time. For example:
import xyz
= xyz.some_function(12)
abc = abc+2
g
switch_world(get_world().spawn_world())import abc
abc.some_other_function(g)
destroy_world()print str(abc)
The final line will display the value we called abc
in
the second line, regardless of what importing a module called
abc
did to mess up our namespace later on.
Functionally
In the world of functional programming, there is no need to encapsulate side-effects, since all purely functional programs are side-effect free (ie. if something is true somewhere, then it is true everywhere, like regular Maths notation). The concept of worlds here is slightly different, since worlds are passed around as arguments to functions rather than acting like a stack, but the idea of encapsulating state remains the same.
In a functional language a function, let’s call it
increment_answer
, can be given another function as an
argument. This is known as a higher-order function, and as far
as the language is concerned, increment_answer
is a
perfectly valid function just waiting to act on something. To put some
tangible code for this, we can say:
:= x*x
square(x) ::= x+1
increment(x) ::= increment(f(x))
increment_answer(f, x) ::= increment_answer(square, x) square_plus_one(x) :
Even though I’ve made up this syntax, I’m sure it is pretty
straightforward. The interesting thing to note, however, is that at no
point in the above is x
ever defined. We can assume that
*
and +
are defined somewhere in this made-up
language, but x
only acts as a placeholder; it can stand
for anything and the definitions given above always be valid, no matter
what the functions do. So far so good, and if you’ve used a language
like Ruby or Python you probably know of higher order functions already.
So let’s spice things up a bit:
max(x, y) ::= x if x > y else y
:= max(0, x) allow_positive(x) :
This probably seems straightforward too: max
will give
the larger of its two arguments whilst allow_positive
will
return its argument if it’s positive, or else zero. Here we’ve been
implying numbers, however. So let’s look at what happens if we give
everything a type:
:= x*x
square(number x) :type(square) = number -> number
:= x+1
increment(number x) :type(increment) = number -> number
-> number f, number x) ::= increment(f(x))
increment_answer(number type(increment_answer) = number -> number -> number -> number
max(number x, number y) ::= x if x > y else y
type(max) = number -> number -> number
:= max(0, x)
allow_positive(number x) :type(allow_positive) = number -> number
The type notation I’ve used means that the left-hand-side is the
input and the right-hand-side is the output, so that the type
number -> number
is a function that, when given a
number
, will return a number
. Easy, yes? Well
what about the type of increment_answer
? It has two
arguments, the first is a function which must accept a
number
(since we pass it x
) and return a
number
(since increment
takes a
number
), so the type of its first argument must be a
function number -> number
. Its second argument is just a
number
. Giving it both of these will return a
number
, so its type is
number -> number -> number -> number
. That may
seem a little confusing (where does “left” end and “right” begin?), so
to consider a slightly simpler, non-higher order example let’s take a
look at max
.
There are two arrows in the type of max
, and it’s not
completely clear that both of the left ones are the inputs
x
and y
. Why have I made the confusing mistake
of putting number -> number -> number
instead of
something like (number -> number) -> number
? Well
it’s because of Currying. The type of max
is
number -> number -> number
, with the left as input
and the right as output, as I said earlier, but attempting to segregate
arguments and returns, for example with brackets, is futile: thanks to
Currying it’s completely up to you what you consider to be the left and
right! The obvious type from the function definition would be two
number
s in to get one number
out, but it’s
also valid to say that it takes a single number
in and
gives out a number -> number
. But what the hell kind of
return value is that? Well, if you look at the other types I’ve
scattered around there, it should be obvious that
number -> number
is a function which takes a
number
and returns a number
. So
max(5)
is a perfectly valid function call, and will return
a function for us! So what does such a function do? In this case it will
give the larger of its argument or 5, so we can say:
:= max(5)
five_or_above :type(five_or_above) = number -> number
12) = 12
five_or_above(3) = 5
five_or_above(type(10) = number
type(five_or_above(10)) = number
It basically acts like max
but with the first argument
‘filled in’ with a 5. This actually allows us to use a simpler
definition of our allow_positive
function, since we can
say:
:= max(0)
allow_positive :type(allow_positive) = number -> number
This will behave in exactly the same way as the previous definition.
What about the other way around? What happens if we give
max
an argument of number -> number
(ie. a
function from number
s to number
s)? Do we get
the remaining, right-most number
as a return? Not quite.
It’s perfectly valid to call max
with a function, for
example:
max(number x, number y) ::= x if x > y else y
type(max) = number -> number -> number
:= x+1
increment(number x) :type(increment) = number -> number
:= max(increment)
result :type(result) = number -> number
Hmm, we seem to have received a function back rather than a
number
. Why’s that? Well it’s because the function we gave
to max
, namely increment
(for lack of a better
example), requires an argument in order to do anything. Calling
max
of increment
only makes sense when we give
increment
something to add one to! Thus
max(increment)
gives us a function which can be thought of
as follows:
max(increment) = x if x > x+1 else x+1
type(max(increment)) = number -> number
In other words it takes a number
in for the first
argument of max
, namely x
, then sends that to
increment
to find out what to use for y
(which
will be x+1
, since that’s what increment
returns), then runs max
of x
and
y
, ie. of x
and x+1
(which
incidentally will always behave like increment
because I
used silly examples, as long as >
is defined as we would
expect of course ;) ). I know that’s quite long-winded, but re-read it a
few times and you’ll soon pick it up!
So what of the beast increment_answer
with its type
number -> number -> number -> number
? Well it’s
the same thing again: we can give it a single number
to get
a function number -> number -> number
(the same type
as max
) which essentially has the first argument ‘already
filled in’. We can give it two number
s to get a function
number -> number
which behaves as if the first two
arguments are ‘already filled in’, or we could give it a function
number -> number
which will ‘fill in’ the first argument
as required and fill in the second argument with that function applied
to the first. We can keep doing this, replacing values with functions
and functions with values, making for a program that’s applicable to a
very general set of problems. As you might be thinking, Currying is a
very powerful technique indeed! It’s one of the reasons people get so
enthused about functional programming and languages like Haskell.
OK, so that’s a little functional programming basics, let’s move on
to more interesting things than numbers! Our programs can contain
functions that do anything we want, and our entire program itself is
just a function as well. So what type do programs have?
void -> void
. This is a rather rubbish type, which
basically means they take in nothing and return nothing. What makes this
especially worse is that a large number of programs spend a lot of time
initialising themselves to start with, and backing up their calculations
before they close.
The functional idea of worlds is to make programs have the type
world -> world
, which gives them an input which they can
get started straight away with, rather than having to faff around with
loading and parsing and such (which is made all the more difficult due
to the stateless nature of the program, which is somehow accomplished by
monads even though I still can’t get my head around what they do), and
at the end they can dump their computations straight out to the
operating system for storage, rather than having to mess with IO, and
all of the complexities, edge cases and breakages that entails.
Thus in this system a program to update a clock on the desktop can be as simple as:
:= new_world_from_existing(w, "clock", w.clock+1) update_clock(world w) :
This is my own made-up syntax, but the the point is that it takes in
a world, whatever that happens to be, and outputs a new world which is
the same as the input except that its “clock” is ‘1 higher’. The point
is that this is an entirely self-contained program (given the function
new_world_from_existing
, which I only abstracted away here
since I can’t be arsed to specify a structure for these worlds, since
it’s arbitrary); the OS gives this program a world when it starts and
saves the new world it passes back when it’s done.
As far as the OS is concerned the worlds are just black boxes, it doesn’t need to know or care what they are. When an event occurs, an OS daemon will trigger the associated programs (imagine something like Unix’s cron), and these can of course trigger actions themselves (for example a clock tick can trigger a time update program, an updated time can trigger a program to render the new clock face image and having a new clock face image can trigger a screen refresh). Essentially it allows unrolling of main loops into one OS-level mechanism, and allows the programs to be short, simple, cooperative and to-the-point.
A difference between the stateful and functional world systems is that in the former, computation happens inside worlds, which capture all of the side-effects to prevent them wreaking havoc. In the latter, computation happens to worlds and new worlds are generated based on the effects of the computation; worlds are immutable and unchanging, but you can make new ones with whatever properties you like (worlds never get “replaced”, although it is valid for the operating system to pass a world to a program and store the new world that is returned in the same place as the old world, as long as no other programs use the old world, but this isn’t so much mutability or replacement as much as it is garbage collection; the old world is not needed any more so it is deleted, whilst some space is required for the new world which may as well be the space previously occupied by the old world).
However, the observant amongst you will have spotted by now where I’m going with this. If you’ve not spotted it yet then it’s that these two concepts of Worlds are the same! The advantage to using worlds in an imperative, stateful language is that side-effects can be tamed, kept isolated and the APIs kept pure and side-effect free to allow more scalability and predictability. With everything wrapped up in world sandboxes, assertions about behaviour can be made which only rely upon the individual pieces of code and the little world in which they live. In other words, this stateful code behaves like pure functions which are Curried with the worlds they return!
In a functional language, worlds allow state to be passed around, so that the same small, simple functions can be used for a variety of behaviours based on the world that they take as input. In other words these pure functions behave like imperative code, just with the state being passed around explicitly!
So what does this mean? Well firstly, if we can make pure, side-effect-free imperative code then we get the awesomeness of functional code in our non-functional languages. This, admittedly, would only occur in a coarse-grained way, since there’s no guarantee that anything below the API level behaves purely, but it’s enough to let us build non-leaky abstractions in the way that functional languages get for free. If we have correct abstractions then we can treat them as black boxes and do all sorts of cool analysis on them, like automatic translation, optimisation, compilation, program synthesis, inter-language calls, etc.
Some of this is going on in the Viewpoints Research Institute’s STEPS project, with their kernel abstractions, essentially treating function calls like virtual machines, implemented as on-the-fly domain-specific language compilation in separate worlds, which makes each layer in the call stack become more sandboxed, more secure, more abstract, simpler and more specialised at its job. It’s an interesting project with some cool papers floating around (although I wish they’d release their publications more often ;) )
Anyhoo, I thought I’d write a blog entry to test out KDE’s Blogilo program, since Google’s fail Web app POS doesn’t like me, and I’m getting rather carried away with it :) Hope this works!