Nat-like Types, or Dynamic and Static Information

Posted on by Chris Warburton

This is based on an explanation I gave on the Idris mailing list about using dependent types to prove that a Vector contains some element.

The Skeleton

All of the types in this post will be variations of the Nat type. If you've not encountered it before, take a look at Peano arithmetic.

Nat

Nat has two constructors:

Z : Nat
S : Nat -> Nat

These can be combined in an infinite number of ways: Z, S Z, S (S Z), etc.

From a static (typing) point of view, Nat doesn't tell us very much at all, since it can always be trivially satisfied with Z (Z is a lot like () or NULL).

From a dynamic (value) point of view, the only thing a Nat tells us is how many S constructors it has.

Hence, we can think of Nat as being the Natural numbers (the 'counting numbers', where we're counting how many S wrappers there are around Z).

We can define types similar to Nat which are more informative, both statically (stronger types) and dynamically (contain more data).

Static Information

These types start with Nat and add static information (ie. stronger types).

Exactly n

We can define a strange type Exactly n which is like Nat, except it stores a Nat in its type too:

data Exactly : Nat -> Type where
  eZ : Exactly Z
  eS : Exactly n -> Exactly (S n)

Dynamically, an Exactly n value is like a Nat: all we can use it for is counting. We can see this if we throw away the static information:

plain : Exactly n -> Nat
plain eZ = Z
plain (eS n) = S (plain n)

However, we have more static information: an Exactly n value will have n occurences of the eS constructor, hence it guarantees the following:

guarantee : n -> Exactly n -> Bool
guarantee n e = plain e == n

In other words, Exactly n is like 'all Naturals which are equal to n'.

Fin t

We can change our constructors slightly to make a useful type called Fin n, which is a lot like Exactly n except for the guarantee that it provides. It has two constructors, just like Nat and Exactly n:

fZ : Fin (S n)
fS : Fin n -> Fin (S n)

Once again, we can throw away the type information to get back to Nat:

plain : Fin n -> Nat
plain  fZ    = Z
plain (fS n) = S (plain n)

Hence, dynamically a Fin n is also just a counter like Nat.

Statically, the guarantee we get from Fin n is that it has exactly n possible values (try it for n = 0, 1, 2, ... and see!).

The extra type information guarantees that:

fin : n -> Fin n -> Bool
fin n f = plain f < n

Hence we can think of Fin n as being like 'the Naturals up to n'. For more info, see this informative StackOverflow answer

NatLTE n m

NatLTE n m is like Exactly n but it stores two Nat arguments in its type. Again, its constructors follow the same basic pattern as Nat:

nEqn : NatLTE n n
nLTESm : NatLTE n m -> NatLTE n (S m)

We can throw away the type info to get:

plain : NatLTE n m -> Nat
plain nEqn = Z
plain (nLTESm n) = S (plain n)

Notice that again, dynamically we're just left with a counter. However, we have some static guarantees about the value of that counter.

Notice that the two Nat values in our type must start out the same, although unlike eZ they can start at any value. Just like eS and fS, we add an S with the nLTESm wrapper, although we also have another Nat which gets passed along unchanged.

The guarantee for NatLTE n m is:

guarantee : n -> m -> NatLTE n m -> Bool
guarantee n m x = n + (plain x) == m

Hence we can think of NatLTE n m as being 'Nats which, when added to n, produce m' or equivalently 'proof that n is less than or equal to m'.

Elem x xs

Elem x xs is like NatLTE n m except its type can contain more than just Nat values. Again, its constructors follow the same basic pattern as Nat:

Here : Elem x (x::xs)
There : Elem x xs -> Elem x (y::xs)

We can throw away the type information to get a Nat:

plain : Elem x xs -> Nat
plain Here = Z
plain (There n) = S (plain n)

Again, dynamically a value of Elem x xs is only useful as a counter. The static guarantee we get is the following:

guarantee : x -> xs -> Elem x xs -> Bool
guarantee x xs e = index' (plain e) xs == Just x

It does this by using Here : Elem x xs to tells us that x is the head of xs, whilst There tells us that x appears somewhere in the tail of xs.

Hence we can think of Elem x xs as being 'proof that x is somewhere in xs'.

We get compile errors if we change the Here and There values in a program, because we're changing the proof. Since the proof tells Idris what the index of the element is, changing that index makes the proof wrong and the compiler rejects it.

Dynamic Information

These types start with Nat and add dynamic information (ie. data).

List t

List t follows the same constructor pattern as Nat:

Nil : List t
Cons x : List t -> List t

The argument to Cons (the x) adds dynamic data to the basic "count the constructors" information found in Nat. Note that we gain no static information, regardless of what t is, since we might always be given a Nil.

We can throw away the dynamic information to get a Nat:

plain : List t -> Nat
plain Nil = Z
plain (Cons _ xs) = S (plain xs)

In the same way that we went from Nat to Fin n, you might try making a new type FinList n t for 'all lists of t up to length n'.

In the same way that we went from Fin n to NatLTE n m, you might try making a new type SuffixList l1 l2 for 'proofs that l1 is a suffix of l2'.

Delay t

Delay t is like List t, except it can only store data in the Nil constructor, not the Cons constructor. For this reason, we rename the constructors to Now and Later, respectively:

Now x : Delay t
Later : Delay t -> Delay t

If we throw away the dynamic information, we get a counter like Nat:

plain : Delay t -> Nat
plain (Now   x) = Z
plain (Later x) = S (plain x)

What about the static information? List t was trivial for all t, since we can satisfy it with Nil. We can't do that with Delay t, since Now requires an argument of type t.

In fact, we can satisfy Delay t trivially, by using Later:

loop : Delay t
loop = Later loop

This defines a never-ending chain of Later (Later (Later (Later ...))). Since we never terminate the chain with a Now, we don't have to provide a value of type t.

This kind of circular reasoning is called co-induction, and is logically sound as long as each constructor can be determined by a terminating function (in this case, we just step along the chain until we reach the desired depth, pattern-matching the (constant) constructors as we go). This condition is called co-termination.

If a function doesn't terminate or co-terminate, it diverges. We can use Delay t to handle diverging functions in a safe way. For example, the Halting Problem tells us that it's impossible to calculate whether a Universal Turing Machine will halt for arbitrary inputs; ie. the UTM might diverge. However, we can always perform each step of the UTM in a finite amount of time.

If we wrap the result of our UTM in a Delay, we can return a Later for each step, and if it does eventually halt we can return a Now. Whenever it diverges, our program doesn't freeze; instead, we get the same Later (Later (Later ...)) chain as loop, which we can inspect to various depths.

Vect n t

Vect n t itself is a bit like FinList n t, except instead of containing all lists up to length n, it contains only lists of length n. It follows the same constructor pattern as Nat and List t:

Nil : Vect Z t
Cons x : Vect n t -> Vect (S n) t

Notice that the Nat in the type of Nil is forced to be Z, instead of being S n for all n like in Fin (hint, that's the difference between Vect and FinList ;) )

If we throw away the static information of Vect n t we get a List t:

plain : Vect n t -> List t
plain Nil = Nil
plain (Cons x xs) = Cons x (plain xs)

If we throw away the dynamic information, we get an Exactly n:

plain : Vector n t -> Exactly n
plain Nil = eZ
plain (Cons _ xs) = eS (plain xs)

Conclusion

We can add dynamic information (values) to a type by adding arguments to the constructors; this takes us, for example, from Nat to List t.

We can also add static information to a type by adding arguments to the type ("strengthening" it), and use the constructors to guarantee some invariant between all these values (ie. we don't provide a way to construct an invalid value).

The simple pattern of two constructors: one which gives a Foo and another which turns a Foo into a Foo, is very powerful, and can represent all kinds of 'linear' things, including counting, lists, bounds, linear relationships, lengths, indexes, etc.