HoTT for Dummies
Based on a Hacker News thread, about a talk by Thorsten Altenkirch, I ended up writing a rather long explanation of some Type Theory and Homotopy Type Theory, which I thought I might as well reproduce here:
Let’s say you’re creating a new programming language. You think it’s a good idea to use static types, but which ones? Well, everyone uses booleans, so let’s include them:
True : Boolean
False : Boolean
(Foo : Bar
means “Foo
has type
Bar
”, or
“Foo
is
a Bar
”).
OK, that seemed pretty easy. But wait, you’ve not given a type to Boolean
. For
the sake of completeness:
Boolean : Type
Uh oh, now you need to give a type to Type
. What
should it be? It turns out (via Girard’s
paradox) that simply saying Type : Type
would make things inconsistent, ie. we would be able to trick the
compiler into accepting incorrect programs.
Instead, we use a series of “levels”:
Boolean : Type 0
Type n : Type (n+1)
So far so good. Now let’s say we want function types, for example:
: Boolean -> Boolean
identity = x
identity x
not : Boolean -> Boolean
not True = False
not False = True
So what’s this ->
thing?
We can think of it as a type-level operator: it takes two types and
returns a function type. In the syntax of natural
deduction, we can say:
: Type n b : Type m
a -----------------------------
-> b : Type (1 + max n m) a
ie. given a
of type Type n
, and
b
of type Type m
, then
a -> b
has
type Type (1 + max n m)
.
Because the type a -> b
somehow ‘contains’ the types a
and b
, we need to ensure it’s at
a higher level than either of them, which is why we do 1 + max n m
.
In fact, there’s no reason for the identity
function to only work on
Boolean
s. We
can replace it with an “identity function factory”, which accepts a type
and returns an identity function for that type:
: (t : Type n) -> t -> t
identity = y identity x y
Here we’ve re-used the foo : bar
notation: rather than just giving the type of the first argument, we’ve
also introduced a variable t
representing its value (this is known as a dependent function). Notice
that the definition of identity
actually ignores the type it’s been given (x
); the implementation doesn’t care
what it is, it’ll just return the second argument no matter what; yet we
need that argument in order to type-check. When we compile this program,
we can “erase” the first argument, since it has no “computational
content”.
We can recover our old identity
function, of type Boolean -> Boolean
,
by applying this new identity
function to the Boolean
type:
Boolean : Boolean -> Boolean identity
OK, what next? Well, since we have functions, we might as well have function composition:
: (a : Type x) -> (b : Type y) -> (c : Type z) -> (b -> c) -> (a -> b) -> a -> c
compose = f (g x) compose t1 t2 t3 g f x
It would also be useful to have equality. Here it is for Boolean
s:
: Boolean -> Boolean -> Boolean
equal True x = x
equal False x = not x equal
However, just like the identity
function, this isn’t very
satisfying. We’d like an “equality function factory”, with this
type:
: (t : Type n) -> t -> t -> Boolean equal
Except, how would we ever implement such a “factory”? It was easy for
identity
: we just return
whatever we’re given. In the case of equality, we need to inspect our
arguments, to see whether they’re actually equal or not. We can’t do
this in a way which works for all types (eg. what if we allow
user-defined types?).
However, there’s a trick. By returning a Boolean
, we’re
defining equality (True
)
and disequality (False
). That’s
hard. Instead, we can ignore the disequality, and only focus on
equality, using a different return type; let’s call it Equal x y
.
What does it mean for two things to be equal? It means that they’re the same thing. In which case, we don’t need both of them! Every value is equal to itself (a property known as “reflexivity”), so that’s all we need!
: (t : Type n) -> (x : t) -> Equal x x refl
For example, here’s equality for the Boolean
s:
Boolean True : Equal True True
refl Boolean False : Equal False False refl
Note that refl
isn’t actually
a function, it’s a data constructor. You can think of a value like refl Boolean True
as being a piece of data, similar to something like pair Int String 10 'foo'
;
it doesn’t reduce to anything, it just gets passed around as-is. (These
are often called “proof objects”, but that’s a bit arbitrary; a value
like pair Int String 10 'foo'
is a “proof” of “Int
AND String
”).
If we allow computation in our types, then two different values which compute (technically: beta reduce) to the same thing are still equal by reflexivity:
Boolean True : Equal True (not False) refl
Here, the not False
will compute to True
, and
refl
will type-check. Different
values, eg. False
and
True
,
are never equal, since they don’t reduce to the same thing. We can make
our computations as complex as we like, for example:
Boolean True : Equal (identity Boolean True) (compose Boolean Boolean Boolean not not True) refl
We can even have equality between functions and equality between types:
-- "identity" is equal to "identity"
: Type n) -> t -> t) identity : Equal identity identity
refl ((t
-- "Boolean" is equal to "Boolean"
Type 0) Boolean : Equal Boolean Boolean refl (
We can even have equalities between equalities!
-- "refl Boolean True" is equal to "refl Boolean True"
Equal True True) (refl Boolean True) (refl Boolean True) : Equal (refl Boolean True) (refl Boolean True) refl (
Most of this predates Homotopy Type Theory, so what are the points being made in the slides?
One point is to give a topological perspective for types: a type is
like a space, values in the type are like points in the space.
Equalities between values are paths in the space (eg. refl Boolean True
is a trivial path from the point True
to
itself, in the Boolean
space). Interestingly, equalities between equalities are homotopies
(smooth transformations between paths).
One question we might ask is whether all equality values are the
same; ie. are they all just refl
? That’s known as the “Uniqueness
of Identity Proofs” (UIP), and it’s an assumption that many people have
been making for decades. However, if we think of equalities as paths
through a space, then UIP says that all those paths can be transformed
into each other. Yet that’s not the case if the space contains a hole!
Consider two paths going from a point X
back to itself; if
one of those paths loops around a hole, and the other doesn’t, then
there’s no way to smoothly transform between the two (without “cutting
and sticking”):
The topological perspective also gives us some intuition about the
“levels” of types: Type 0
contains spaces with distinct points, eg. Boolean
containing True
and False
. These
are essentially sets, from Set Theory. Although HoTT doesn’t assume UIP
for all types, those which do just-so-happen to “collapse” down
to one value (ie. there are equalities between every point) actually
occupy a level below sets; ie. they end up at Type -1
(there’s no significance to the negative number; it’s just a historical
accident caused by definitions like Boolean : Type 0
).
Likewise, those which add more structure occupy higher levels.
One important question is how function equality behaves. It’s useful
to have equality for, say, Boolean
s,
since we can compute their value and check whether they’re the same.
Functions are trickier: we have intensional equality (eg. Equal identity identity
)
but we’d like extensional equality (eg. that identity
and not . not
are equal, ie. Equal (identity Boolean) (compose Boolean Boolean Boolean not not)
).
This is tricky. If we assume extensionality as an axiom (ie. we assume a
built-in function of type ((x : t) -> Equal (f x) (g x)) -> Equal f g
),
like in NuPRL, then we lose the ability to execute our programs (axioms
are basically “built in primitives”; we don’t know how to implement such
an extensionality primitive). Observational Type Theory gets us most of
the way there, but relies on UIP, which would collapse all of the
higher-level structure in HoTT.
Univalence is also a nice feature of HoTT. It’s incompatible with
UIP, but allows us to reason “up to isomorphism”. For example, in set
theory the sets {foo, bar}
and {x, y}
are
isomorphic: eg. we can switch foo
with x
and
bar
with y
. However, results concerning
{foo, bar}
may be invalid for {x, y}
, since
set theory lets us say things like “is foo
a member of
S
?”, which is true
when
S = {foo, bar}
but false
when
S = {x, y}
. HoTT doesn’t let us say things like “is
foo
a member of type T
?”, hence we don’t get
these kind of “abstraction leaks”, so our programs and proofs can be
automatically lifted from one representation to another. For example, if
you define a fast, distributed datastructure which is isomorphic to a
linked-list, then you can automatically lift every linked-list program
to your new datastructure using univalence. Unfortunately, we haven’t
figured out how to implement univalence yet (but many people think it’s
possible).
Some of the results listed at the end of the slides concern the use of HoTT to prove topological results, eg. that the integers are isomorphic to paths around a hole (you can go around the hole any number of times, clockwise or anticlockwise).