# Calculating Arity

Posted on by Chris Warburton

The arity of a value is the number of arguments it can accept. We can define arity by pattern-matching on types (a feature known as typecase):

``````arity (I -> O) = 1 + arity O
arity  T       = 0``````

However, it's difficult to implement this kind of type-level function. Here are some approaches I tried in Haskell.

# Type Classes

We want `arity` to have a different value, depending on the type we're using. We can do this by overloading `arity` using type classes:

``````class HasArity a where
arity :: Int``````

However, we hit problems if we try to implement this type class:

``````instance (HasArity o) => HasArity (i -> o) where
arity = 1 + arity

instance HasArity t where
arity = 0``````

There are two issues here. Firstly, we don't know which `arity` to use in the expression `1 + arity`:

``````
/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25587-0.hs:2:3: error:
• Could not deduce (HasArity a0)
from the context: HasArity a
bound by the type signature for:
arity :: forall a. HasArity a => Int
at /tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25587-0.hs:2:3-14
The type variable ‘a0’ is ambiguous
• In the ambiguity check for ‘arity’
To defer the ambiguity check to use sites, enable AllowAmbiguousTypes
When checking the class method:
arity :: forall a. HasArity a => Int
In the class declaration for ‘HasArity’
|
2 |   arity :: Int
|   ^^^^^^^^^^^^

/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25587-0.hs:6:10: error:
• Illegal instance declaration for ‘HasArity t’
(All instance types must be of the form (T a1 ... an)
where a1 ... an are *distinct type variables*,
and each type variable appears at most once in the instance head.
Use FlexibleInstances if you want to disable this.)
• In the instance declaration for ‘HasArity t’
|
6 | instance HasArity t where
|          ^^^^^^^^^^
``````

We can fix this by turning `arity` into a function:

``````class HasArity a where
arity :: a -> Int

instance (HasArity o) => HasArity (i -> o) where
-- undefined inhabits every type, including i
arity f = 1 + arity (f undefined)

instance HasArity t where
arity _ = 0``````

The other problem is that these two instances overlap; the first is just a special case of the second:

``main = print (arity not)``
``````
/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25612-0.hs:10:15: error:
• Overlapping instances for HasArity (Bool -> Bool)
arising from a use of ‘arity’
Matching instances:
instance HasArity t
-- Defined at /tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25612-0.hs:8:10
instance HasArity o => HasArity (i -> o)
-- Defined at /tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25612-0.hs:4:10
• In the first argument of ‘print’, namely ‘(arity not)’
In the expression: print (arity not)
In an equation for ‘main’: main = print (arity not)
|
10 | main = print (arity not)
|               ^^^^^^^^^
``````

Unlike pattern-matching, which tries alternatives from top to bottom, type classes don't have such an ordering. This makes it difficult to implement a "catch-all" instance which works for any type other than functions.

# Multi-Parameter Type Classes

The classic way to implement type-level computation in Haskell is to use multi-parameter type classes as a form of logic programming. Similar to Prolog, to represent a total function of the form `arity x = a`, we instead define a partial relation of the form `arity x a`. We can then implement that relation as a multi-parameter type class, relating the input type to its arity:

``````class HasArity a n where
arity :: a -> n``````

Unfortunately, Haskell doesn't have dependent types, so we can't represent arities with regular numbers:

``````instance (HasArity o n) => HasArity (i -> o) (1 + n) where
arity f = 1 + arity (f undefined)

instance HasArity t 0 where
arity _ = 0``````
``````
/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25640-0.hs:3:47: error:
Illegal type: ‘1’ Perhaps you intended to use DataKinds
|
3 | instance (HasArity o n) => HasArity (i -> o) (1 + n) where
|                                               ^

/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25640-0.hs:3:47: error:
Not in scope: type constructor or class ‘+’
|
3 | instance (HasArity o n) => HasArity (i -> o) (1 + n) where
|                                               ^^^^^

/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25640-0.hs:3:47: error:
Illegal operator ‘+’ in type ‘1 + n’
Use TypeOperators to allow operators in types
|
3 | instance (HasArity o n) => HasArity (i -> o) (1 + n) where
|                                               ^^^^^

/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25640-0.hs:6:21: error:
Illegal type: ‘0’ Perhaps you intended to use DataKinds
|
6 | instance HasArity t 0 where
|                     ^
``````

Type-level programming in Haskell can only involve types, not regular values like numbers, so we need some type-level notion of numbers instead. In other words, an arity `n` is both a type and a number; it's not a "numeric type", like `Int`, since `Int` itself is not a number; likewise it's not an element of a numeric type like `Int`, like `3 :: Int`, since `Int`s are not types. We need a new notion, which satisfies both of these requirements:

``````-- Type level Peano numerals
data Zero   = Z
data Succ n = S n``````

To make life easier, we provide a `toInt` function to turn these Peano numbers into regular `Int`s:

``````class ToInt a where
toInt :: a -> Int

instance ToInt Zero where
toInt _ = 0

instance (ToInt n) => ToInt (Succ n) where
toInt (S x) = 1 + toInt x``````

Now we can return these Peano numbers from our `arity` functions:

``````instance (HasArity o n) => HasArity (i -> o) (Succ n) where
arity f = S (arity (f undefined))

instance HasArity t Zero where
arity _ = Z``````

This gets us part of the way there; our arities are now living at the same level as our types, but now we have a couple of problems regarding ambiguity:

``main = print (arity not)``
``````
/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25788-0.hs:19:8: error:
• Ambiguous type variable ‘a0’ arising from a use of ‘print’
prevents the constraint ‘(Show a0)’ from being solved.
Probable fix: use a type annotation to specify what ‘a0’ should be.
These potential instances exist:
instance Show Ordering -- Defined in ‘GHC.Show’
instance Show Integer -- Defined in ‘GHC.Show’
instance Show a => Show (Maybe a) -- Defined in ‘GHC.Show’
...plus 22 others
...plus 11 instances involving out-of-scope types
(use -fprint-potential-instances to see them all)
• In the expression: print (arity not)
In an equation for ‘main’: main = print (arity not)
|
19 | main = print (arity not)
|        ^^^^^^^^^^^^^^^^^

/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25788-0.hs:19:15: error:
• Ambiguous type variable ‘a0’ arising from a use of ‘arity’
prevents the constraint ‘(HasArity
(Bool -> Bool) a0)’ from being solved.
(maybe you haven't applied a function to enough arguments?)
Probable fix: use a type annotation to specify what ‘a0’ should be.
These potential instances exist:
instance HasArity t Zero
-- Defined at /tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25788-0.hs:17:10
instance HasArity o n => HasArity (i -> o) (Succ n)
-- Defined at /tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX25788-0.hs:14:10
• In the first argument of ‘print’, namely ‘(arity not)’
In the expression: print (arity not)
In an equation for ‘main’: main = print (arity not)
|
19 | main = print (arity not)
|               ^^^^^^^^^
``````

# Functional Dependencies

These ambiguities are due to the open nature of type classes: we can declare new instances at any point, so the fact that we just-so-happen to have instances with `Zero` and `Succ n` as the second parameter doesn't actually tell Haskell very much; it might come across other definitions elsewhere which use other types as the second parameter, and hence there is ambiguity.

To make the second parameter unambiguous we can use functional dependencies to indicate that the second parameter is completely determined by the first one:

``````class HasArity a n | a -> n where
arity :: a -> n``````

Now we can make our two instances:

``````instance (HasArity o n) => HasArity (i -> o) (Succ n) where
arity f = S (arity (f undefined))

instance HasArity t Zero where
arity _ = Z``````

However, we still have an ambiguity problem; the "catch-all" parameter `t` of the second instance overlaps with the more-specific function type `i -> o` of the first:

``````
/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX26268-0.hs:14:10: error:
Functional dependencies conflict between instance declarations:
instance HasArity o n => HasArity (i -> o) (Succ n)
-- Defined at /tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX26268-0.hs:14:10
instance HasArity t Zero
-- Defined at /tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX26268-0.hs:17:10
|
14 | instance (HasArity o n) => HasArity (i -> o) (Succ n) where
|          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
``````

# Type Families

Type families are the solution we've been looking for. I originally gave up trying to understand these, after banging my head against GHC for a while. I was recently inspired to have another go, after attending a talk by Lennart Augustsson which involved some pretty hairy type-level hackery. During the course of his presentation, type families finally clicked for me, and I was able to bang out the following approach:

``````type family Arity t :: * where
Arity (i -> o) = Succ (Arity o)
Arity t        = Zero``````

This defines `Arity` as a closed type family. A type family is basically a type-level function (unlike the type-level relations encoded by multiparameter typeclasses). Type families can be open, allowing new pattern clauses to be added at any point, provided they don't overlap (just like typeclass instantiation); alternatively they can be closed, where a fixed set of, potentially overlapping, pattern clauses are used (just like function definition). We use the latter, to allow our "catch-all" pattern to overlap with the `i -> o` pattern.

With this type family, we can finally get the arity of each type. However, one problem still remains: the result will only be accessible at the type-level. To associate a value with each `Arity` type, we can use type classes; we no longer have problems with ambiguity, since we can treat all arities in the same way.

First we use the fact that our type-level numbers are singletons, i.e. they only contain one value (ignoring `undefined`). We use a type class to associate the name `singleton :: a` with the single value of type `a`:

``````class Singleton a where
singleton :: a

instance Singleton Zero where
singleton = Z

instance (Singleton n) => Singleton (Succ n) where
singleton = S singleton``````

We can use the `Singleton` class to write our `arity` function. All of our calculation is performed by `Arity` at the type level, which results in either `Zero` or `Succ n` for some `n`; we use `singleton` to access the (only) value of that result type:

``````arity :: (Singleton (Arity a)) => a -> Arity a
arity _ = singleton

-- Handy helper functions

-- Return the arity of a value, as an Int
intArity :: (Singleton (Arity a), ToInt (Arity a)) => a -> Int
intArity = toInt . arity

-- Pretty output
format s x = concat ["Arity of '", s, "' is ", show (intArity x)]``````

Now, at last, we can get the arity of values:

``````main = mapM_ putStrLn [
format "Bool -> Bool"           not
, format "String"                 "hello"
, format "Bool -> Bool -> Bool"   (&&)
, format "(a -> b) -> [a] -> [b]" map
]``````
``````Arity of 'Bool -> Bool' is 1
Arity of 'String' is 0
Arity of 'Bool -> Bool -> Bool' is 2
Arity of '(a -> b) -> [a] -> [b]' is 2
``````

It's also interesting to note the values which don't work. For example, we fail to get the arity of the identity function `id :: a -> a`:

``main = putStrLn (format "a -> a" id)``
``````
/tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX26742-0.hs:34:18: error:
• Ambiguous type variable ‘a0’ arising from a use of ‘format’
prevents the constraint ‘(ToInt (Arity a0))’ from being solved.
Probable fix: use a type annotation to specify what ‘a0’ should be.
These potential instances exist:
instance ToInt n => ToInt (Succ n)
-- Defined at /tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX26742-0.hs:10:10
instance ToInt Zero
-- Defined at /tmp/nix-build-2015-11-12-calculating_arity.html.drv-0/runghcXXXX26742-0.hs:7:10
• In the first argument of ‘putStrLn’, namely
‘(format "a -> a" id)’
In the expression: putStrLn (format "a -> a" id)
In an equation for ‘main’: main = putStrLn (format "a -> a" id)
|
34 | main = putStrLn (format "a -> a" id)
|                  ^^^^^^^^^^^^^^^^^^
``````

This is because the output type of `id` is completely polymorphic: it could be anything, including a function. We know the arity must be at least 1, but it could be arbitrarily higher; we don't have enough information to know, and hence the compiler can't finish the calculation.

To get around this, we can restrict the output type of `id`. We don't have to completely monomorphise it, we just need to provide enough information to perform the type-level pattern-match for `Arity`:

``````main = mapM_ putStrLn [
format "Bool       -> Bool      " (id :: Bool       -> Bool)
, format "[a]        -> [a]       " (id :: [a]        -> [a])
, format "Maybe a    -> Maybe a   " (id :: Maybe a    -> Maybe a)
, format "(a -> [b]) -> (a -> [b])" (id :: (a -> [b]) -> (a -> [b]))
]``````
``````Arity of 'Bool       -> Bool      ' is 1
Arity of '[a]        -> [a]       ' is 1
Arity of 'Maybe a    -> Maybe a   ' is 1
Arity of '(a -> [b]) -> (a -> [b])' is 2
``````