Calculating Arity
The arity of a value is the number of arguments it can accept. We can define arity by pattern-matching on types (a feature known as typecase):
I -> O) = 1 + arity O
arity (T = 0 arity
However, it’s difficult to implement this kind of type-level function. Here are some approaches I tried in Haskell.
Type Classes
We want arity
to have a
different value, depending on the type we’re using. We can do this by
overloading arity
using
type classes:
class HasArity a where
arity :: Int
However, we hit problems if we try to implement this type class:
instance (HasArity o) => HasArity (i -> o) where
= 1 + arity
arity
instance HasArity t where
= 0 arity
There are two issues here. Firstly, we don’t know which arity
to use in the expression 1 + arity
:
/build/runghcXXXX29-0.hs:2:3: error: [GHC-39999]
• Could not deduce ‘HasArity a0’
from the context: HasArity a
bound by the type signature for:
arity :: forall {k} (a :: k). HasArity a => Int
at /build/runghcXXXX29-0.hs:2:3-14
The type variable ‘a0’ is ambiguous
• In the ambiguity check for ‘arity’
To defer the ambiguity check to use sites, enable AllowAmbiguousTypes
When checking the class method:
arity :: forall {k} (a :: k). HasArity a => Int
In the class declaration for ‘HasArity’
|
2 | arity :: Int
| ^^^^^^^^^^^^
We can fix this by turning arity
into a function:
class HasArity a where
arity :: a -> Int
instance (HasArity o) => HasArity (i -> o) where
-- undefined inhabits every type, including i
= 1 + arity (f undefined)
arity f
instance HasArity t where
= 0 arity _
The other problem is that these two instances overlap; the first is just a special case of the second:
= print (arity not) main
/build/runghcXXXX49-0.hs:10:15: error: [GHC-43085]
• Overlapping instances for HasArity (Bool -> Bool)
arising from a use of ‘arity’
Matching instances:
instance HasArity t -- Defined at /build/runghcXXXX49-0.hs:8:10
instance HasArity o => HasArity (i -> o)
-- Defined at /build/runghcXXXX49-0.hs:4:10
• In the first argument of ‘print’, namely ‘(arity not)’
In the expression: print (arity not)
In an equation for ‘main’: main = print (arity not)
|
10 | main = print (arity not)
| ^^^^^
Unlike pattern-matching, which tries alternatives from top to bottom, type classes don’t have such an ordering. This makes it difficult to implement a “catch-all” instance which works for any type other than functions.
Multi-Parameter Type Classes
The classic way to implement type-level computation in Haskell is to
use multi-parameter type classes as a form of logic programming. Similar
to Prolog, to represent a total function of the form arity x = a
,
we instead define a partial relation of the form arity x a
. We can then implement that
relation as a multi-parameter type class, relating the input type to its
arity:
class HasArity a n where
arity :: a -> n
Unfortunately, Haskell doesn’t have dependent types, so we can’t represent arities with regular numbers:
instance (HasArity o n) => HasArity (i -> o) (1 + n) where
= 1 + arity (f undefined)
arity f
instance HasArity t 0 where
= 0 arity _
/build/runghcXXXX69-0.hs:3:47: error:
Illegal type: ‘1’ Perhaps you intended to use DataKinds
|
3 | instance (HasArity o n) => HasArity (i -> o) (1 + n) where
| ^
/build/runghcXXXX69-0.hs:3:49: error: [GHC-76037]
Not in scope: type constructor or class ‘+’
|
3 | instance (HasArity o n) => HasArity (i -> o) (1 + n) where
| ^
/build/runghcXXXX69-0.hs:6:21: error:
Illegal type: ‘0’ Perhaps you intended to use DataKinds
|
6 | instance HasArity t 0 where
| ^
Type-level programming in Haskell can only involve types,
not regular values like numbers, so we need some type-level notion of
numbers instead. In other words, an arity n
is both a type and a
number; it’s not a “numeric type”, like Int
, since
Int
itself is not a number; likewise it’s not an element of a numeric type
like Int
, like
3 :: Int
,
since Int
s are not
types. We need a new notion, which satisfies both of these
requirements:
-- Type level Peano numerals
data Zero = Z
data Succ n = S n
To make life easier, we provide a toInt
function
to turn these Peano numbers into regular Int
s:
class ToInt a where
toInt :: a -> Int
instance ToInt Zero where
toInt _ = 0
instance (ToInt n) => ToInt (Succ n) where
toInt (S x) = 1 + toInt x
Now we can return these Peano numbers from our arity
functions:
instance (HasArity o n) => HasArity (i -> o) (Succ n) where
= S (arity (f undefined))
arity f
instance HasArity t Zero where
= Z arity _
This gets us part of the way there; our arities are now living at the same level as our types, but now we have a couple of problems regarding ambiguity:
= print (arity not) main
/build/runghcXXXX91-0.hs:19:8: error: [GHC-39999]
• Ambiguous type variable ‘a0’ arising from a use of ‘print’
prevents the constraint ‘(Show a0)’ from being solved.
Probable fix: use a type annotation to specify what ‘a0’ should be.
Potentially matching instances:
instance Show Ordering -- Defined in ‘GHC.Show’
instance Show a => Show (Maybe a) -- Defined in ‘GHC.Show’
...plus 25 others
...plus 13 instances involving out-of-scope types
(use -fprint-potential-instances to see them all)
• In the expression: print (arity not)
In an equation for ‘main’: main = print (arity not)
|
19 | main = print (arity not)
| ^^^^^
/build/runghcXXXX91-0.hs:19:15: error: [GHC-39999]
• Ambiguous type variable ‘a0’ arising from a use of ‘arity’
prevents the constraint ‘(HasArity
(Bool -> Bool) a0)’ from being solved.
(maybe you haven't applied a function to enough arguments?)
Probable fix: use a type annotation to specify what ‘a0’ should be.
Potentially matching instances:
instance HasArity t Zero
-- Defined at /build/runghcXXXX91-0.hs:17:10
instance HasArity o n => HasArity (i -> o) (Succ n)
-- Defined at /build/runghcXXXX91-0.hs:14:10
• In the first argument of ‘print’, namely ‘(arity not)’
In the expression: print (arity not)
In an equation for ‘main’: main = print (arity not)
|
19 | main = print (arity not)
| ^^^^^
Functional Dependencies
These ambiguities are due to the open nature of type
classes: we can declare new instances at any point, so the fact that we
just-so-happen to have instances with Zero
and Succ n
as the
second parameter doesn’t actually tell Haskell very much; it might come
across other definitions elsewhere which use other types as the second
parameter, and hence there is ambiguity.
To make the second parameter unambiguous we can use functional dependencies to indicate that the second parameter is completely determined by the first one:
class HasArity a n | a -> n where
arity :: a -> n
Now we can make our two instances:
instance (HasArity o n) => HasArity (i -> o) (Succ n) where
= S (arity (f undefined))
arity f
instance HasArity t Zero where
= Z arity _
However, we still have an ambiguity problem; the “catch-all”
parameter t
of the second
instance overlaps with the more-specific function type i -> o
of
the first:
/build/runghcXXXX111-0.hs:14:10: error: [GHC-46208]
Functional dependencies conflict between instance declarations:
instance HasArity o n => HasArity (i -> o) (Succ n)
-- Defined at /build/runghcXXXX111-0.hs:14:10
instance HasArity t Zero
-- Defined at /build/runghcXXXX111-0.hs:17:10
|
14 | instance (HasArity o n) => HasArity (i -> o) (Succ n) where
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Type Families
Type families are the solution we’ve been looking for. I originally gave up trying to understand these, after banging my head against GHC for a while. I was recently inspired to have another go, after attending a talk by Lennart Augustsson which involved some pretty hairy type-level hackery. During the course of his presentation, type families finally clicked for me, and I was able to bang out the following approach:
type family Arity t :: * where
Arity (i -> o) = Succ (Arity o)
Arity t = Zero
This defines Arity
as a
closed type family. A type family is basically a type-level
function (unlike the type-level relations encoded by
multiparameter typeclasses). Type families can be open,
allowing new pattern clauses to be added at any point, provided they
don’t overlap (just like typeclass instantiation); alternatively they
can be closed, where a fixed set of, potentially overlapping,
pattern clauses are used (just like function definition). We use the
latter, to allow our “catch-all” pattern to overlap with the i -> o
pattern.
With this type family, we can finally get the arity of each type.
However, one problem still remains: the result will only be accessible
at the type-level. To associate a value with each Arity
type, we
can use type classes; we no longer have problems with ambiguity, since
we can treat all arities in the same way.
First we use the fact that our type-level numbers are
singletons, i.e. they only contain one value (ignoring undefined
). We
use a type class to associate the name singleton :: a
with the single value of type a
:
class Singleton a where
singleton :: a
instance Singleton Zero where
= Z
singleton
instance (Singleton n) => Singleton (Succ n) where
= S singleton singleton
We can use the Singleton
class to write our arity
function. All of our calculation is performed by Arity
at the
type level, which results in either Zero
or Succ n
for
some n
; we use singleton
to access the (only) value
of that result type:
arity :: (Singleton (Arity a)) => a -> Arity a
= singleton
arity _
-- Handy helper functions
-- Return the arity of a value, as an Int
intArity :: (Singleton (Arity a), ToInt (Arity a)) => a -> Int
= toInt . arity
intArity
-- Pretty output
= concat ["Arity of '", s, "' is ", show (intArity x)] format s x
Now, at last, we can get the arity of values:
= mapM_ putStrLn [
main "Bool -> Bool" not
format "String" "hello"
, format "Bool -> Bool -> Bool" (&&)
, format "(a -> b) -> [a] -> [b]" map
, format ]
Arity of 'Bool -> Bool' is 1
Arity of 'String' is 0
Arity of 'Bool -> Bool -> Bool' is 2
Arity of '(a -> b) -> [a] -> [b]' is 2
It’s also interesting to note the values which don’t work.
For example, we fail to get the arity of the identity function id :: a -> a
:
= putStrLn (format "a -> a" id) main
/build/runghcXXXX156-0.hs:34:18: error: [GHC-39999]
• Ambiguous type variable ‘a0’ arising from a use of ‘format’
prevents the constraint ‘(Singleton (Arity a0))’ from being solved.
Probable fix: use a type annotation to specify what ‘a0’ should be.
Potentially matching instances:
instance Singleton n => Singleton (Succ n)
-- Defined at /build/runghcXXXX156-0.hs:21:10
instance Singleton Zero
-- Defined at /build/runghcXXXX156-0.hs:18:10
• In the first argument of ‘putStrLn’, namely
‘(format "a -> a" id)’
In the expression: putStrLn (format "a -> a" id)
In an equation for ‘main’: main = putStrLn (format "a -> a" id)
|
34 | main = putStrLn (format "a -> a" id)
| ^^^^^^
This is because the output type of id
is
completely polymorphic: it could be anything, including a
function. We know the arity must be at least 1, but it
could be arbitrarily higher; we don’t have enough information to know,
and hence the compiler can’t finish the calculation.
To get around this, we can restrict the output type of id
. We don’t
have to completely monomorphise it, we just need to provide enough
information to perform the type-level pattern-match for Arity
:
= mapM_ putStrLn [
main "Bool -> Bool " (id :: Bool -> Bool)
format "[a] -> [a] " (id :: [a] -> [a])
, format "Maybe a -> Maybe a " (id :: Maybe a -> Maybe a)
, format "(a -> [b]) -> (a -> [b])" (id :: (a -> [b]) -> (a -> [b]))
, format ]
Arity of 'Bool -> Bool ' is 1
Arity of '[a] -> [a] ' is 1
Arity of 'Maybe a -> Maybe a ' is 1
Arity of '(a -> [b]) -> (a -> [b])' is 2