writing: 1e5042858af24829cf5aeaedb67f542ae0f558dd

     1: \section{Haskell}
     2: \label{sec:haskell}
     3: 
     4: \begin{figure}
     5:   \centering
     6:   \begin{haskell}
     7: -- A datatype with two constructors
     8: data Bool = True | False
     9: 
    10: -- A recursive datatype (S requires a Nat as argument)
    11: data Nat = Z | S Nat
    12: 
    13: -- A polymorphic (AKA "generic") datatype
    14: data List t = Nil | Cons t (List t)
    15: 
    16: -- Arithmetic functions
    17: 
    18: plus :: Nat -> Nat -> Nat
    19: plus    Z  y = y
    20: plus (S x) y = S (plus x y)
    21: 
    22: mult :: Nat -> Nat -> Nat
    23: mult    Z  y = Z
    24: mult (S x) y = plus y (mult x y)
    25: 
    26: -- Mutually-recursive functions
    27: 
    28: odd :: Nat -> Bool
    29: odd    Z  = False
    30: odd (S n) = even n
    31: 
    32: even :: Nat -> Bool
    33: even    Z  = True
    34: even (S n) = odd n
    35:   \end{haskell}
    36:   \caption{Haskell code, defining datatypes and functions involving them (note
    37:     that \hs{--} introduces a comment). \hs{Bool} is the Booleans, \hs{Nat} is a
    38:     Peano encoding of the natural numbers and \hs{List t} are polymorphic lists
    39:     with elements of type \hs{t}. Juxtaposition denotes a function call, e.g.
    40:     \hs{f x}, and functions may be defined by pattern-matching (case analysis)
    41:     on their arguments. \hs{A :: B} is an optional type annotation, stating that
    42:     value \hs{A} has type \hs{B}. \hs{A -> B} is the type of functions from
    43:     \hs{A} to \hs{B}.}
    44:   \label{fig:haskellexample}
    45: \end{figure}
    46: 
    47: We mostly focus our attention on the Haskell programming language, both as an
    48: implementation vehicle and as our representation of functions, properties, etc.
    49: to explore. We choose Haskell since it combines formal, logical underpinnings
    50: which aid reasoning (compared to more popular languages like Java or C), yet it
    51: is still popular enough to sustain a rich ecosystem of tooling and a large
    52: corpus of existing code (compared to more formal languages like Coq or
    53: Isabelle). Haskell is well-suited to programming language research; indeed, this
    54: was a goal of the language's creators \cite{marlow2010haskell}. An example of
    55: Haskell code is given in Figure~\ref{fig:haskellexample}.
    56: 
    57: Like most \emph{functional} programming languages, Haskell builds upon
    58: $\lambda$-calculus, with extra features such as a strong type system and
    59: ``syntactic sugar'' to improve readability. The following features make Haskell
    60: especially useful for our purposes, although many are also present in other
    61: languages such as StandardML and Coq (which we also use, but only when needed
    62: for compatibility with prior work):
    63: 
    64: \begin{description}
    65: 
    66: \item{Functional}: Most control flow in Haskell (all except
    67:   \emph{pattern-matching}, described below) is performed by function abstraction
    68:   and application , which we can reason about using standard rules of inference
    69:   such as \emph{modus ponens}. For example, reading from external sources (like
    70:   environment variables) is \emph{impure}, so the only way to parameterise a
    71:   value of type \hs{B} with a value of type \hs{A} is using a function of type
    72:   \hs{A -> B}. Conversely, applying a function of type \hs{A -> B} to a value of
    73:   type \hs{A} can only ever produce a value of type \hs{B} (it may instead crash
    74:   the program or loop forever, but neither of those are \emph{observable}, i.e.
    75:   we can't branch on them).
    76: 
    77: \item{Pure}: Execution of actions (e.g. reading or writing files) is separate to
    78:   evaluation of expressions; hence our reasoning can safely ignore complicated
    79:   external and non-local interactions. Purity ensures \emph{referential
    80:     transparency}: references can be substituted for their referent with
    81:   no change in semantics (as in standard mathematical practice). This implies
    82:   that calling a function twice with the same argument must produce the same
    83:   result (modulo crashes or infinite loops). For example, consider applying the
    84:   function \hs{pair x = (x, x)} to some arbitrary function call \hs{foo y}. The
    85:   references \hs{x} in the resulting pair \hs{(x, x)} both refer to \hs{foo y}
    86:   so, by referential transparency, can be substituted to get
    87:   \hs{(foo y, foo y)}. Both members of \hs{(x, x)} are identical, by definition;
    88:   hence, to preserve the semantics, both (and indeed \emph{all}) calls to
    89:   \hs{foo y} must also be identical. This holds regardless of what we choose for
    90:   \hs{foo} and \hs{y}, and implies that function behaviour cannot depend on
    91:   external interactions or non-deterministic processes (which may change between
    92:   calls).
    93: 
    94: \item{Statically Typed}: Expressions are constrained by \emph{types}, which can
    95:   be used to eliminate unwanted combinations of values (for example
    96:   \hs{plus True} is not a valid expression), and hence reduce search spaces;
    97:   \emph{static} types can be deduced syntactically, without having to execute
    98:   the code. The type of Haskell expressions can also be \emph{inferred}
    99:   automatically.\footnote{Except for certain cases, such as those caused by the
   100:     \emph{monomorphism restriction}~\cite{marlow2010haskell}[$\S$4.5.5]}
   101: 
   102: \item{Curried}: All functions in Haskell take a single argument (as in
   103:   $\lambda$-calculus), which makes them easier to manipulate programatically.
   104:   Currying allows multi-argument functions to be simulated, by accepting one
   105:   argument and returning a function to accept the rest. The \hs{mult} function
   106:   in Figure~\ref{fig:haskellexample} has type \hs{Nat -> Nat -> Nat} meaning it
   107:   takes a \hs{Nat} as argument and returns a function of type
   108:   \hs{Nat -> Nat}. Function calls are written with whitespace, so \hs{mult x y}
   109:   calls the \hs{mult} function with the argument \hs{x}, then calls the result
   110:   of that with the argument \hs{y}. This allows \emph{partial application} such
   111:   as \hs{double = mult (S (S Z))}, but for our purposes it is important for
   112:   unifying calling conventions. For example, in Javascript \hs{mult x y} would
   113:   be either \js{mult(x, y)} or \js{mult(x)(y)}, and depends on the definition of
   114:   \js{mult} (the problem compounds with more arguments). In Haskell there is no
   115:   distinction between these forms.
   116: 
   117: \item{Non-strict}: ``Strict'' evaluation strategies evaluate the arguments of a
   118:   function call before the function body; non-strict does the opposite. The
   119:   Haskell standards do not specify a particular evaluation strategy, but they
   120:   do require that it be non-strict (for efficiency, most implementations use
   121:   \emph{lazy} evaluation to avoid duplicating work). Strictness can result in
   122:   infinite loops and other errors which may be avoided by non-strictness,
   123:   making the latter more useful for reasoning in the face of such errors.
   124:   For example, given a pair of values \hs{(x, y)} and a projection function
   125:   \hs{fst}, we might make the ``obvious''  conjecture that
   126:   \hs{fst (x, y) = x}.\footnote{Incidentally, this is also a valid definition of
   127:     the \hs{fst} function} This statement is true for non-strict languages, but
   128:   \emph{not} for strict languages. Crucially, a strict language will attempt to
   129:   calculate the value of \hs{y}, which may cause an infinite loop or other
   130:   error; a non-strict language like Haskell will ignore \hs{y} since it isn't
   131:   used in the body of the \hs{fst} function.
   132: 
   133: \item{Algebraic Data Types}: These provide a rich grammar for building up
   134:   user-defined data representations, and an inverse mechanism to inspect these
   135:   data by \emph{pattern-matching} (Haskell's other form of control flow). The
   136:   \hs{Bool}, \hs{Nat} and \hs{List t} definitions in
   137:   Figure~\ref{fig:haskellexample} are ADTs; whilst the functions use
   138:   pattern-matching to branch on their first argument. For our purposes, the
   139:   useful consequences of ADTs and pattern-matching include their amenability for
   140:   inductive proofs and the fact they are \emph{closed}; i.e. an ADT's
   141:   declaration specifies all of the normal forms for that type. This makes
   142:   exhaustive case analysis trivial, which would be impossible for \emph{open}
   143:   types (for example, consider classes in an object oriented language, where new
   144:   subclasses can be introduced at any time).
   145: 
   146: \item{Parametricity}: This allows Haskell \emph{values} to be parameterised over
   147:   \emph{type-level} objects; provided those objects are never inspected. This
   148:   enables \emph{polymorphism}. The \hs{List t} type in
   149:   Figure~\ref{fig:haskellexample} is an example: there are many useful functions
   150:   involving lists which work in the same way regardless of the element type
   151:   (e.g. getting the length, appending, reversing, etc.). An even simpler example
   152:   is the polymorphic identity function \hs{id x = x}. The type of \hs{id} is
   153:   \hs{forall t. t -> t}\footnote{The \hs{forall t.} is optional; type-level
   154:     identifiers beginning with a lowercase letter are assumed to be universally
   155:     quantified variables.}, which we can view as taking \emph{two} parameters:
   156:   a type \hs{t} and a value of type \hs{t}. Only the latter argument appears in
   157:   the definition (as \hs{x}), meaning that we can't use the type \hs{t} to
   158:   determine the function's behaviour. Indeed, in the case of \hs{id} we can't
   159:   branch on the value of \hs{x} either, since we don't know what type it might
   160:   have (our definition must work \emph{for all} types \hs{t}); the only
   161:   functions we can call on \hs{x} must also be polymorphic, and hence also
   162:   incapable of branching. The type of \hs{id} states that it returns a value of
   163:   type \hs{t}; without knowing what that type is, the only type-correct value it
   164:   can return is the argument \hs{x}. Hence the \emph{type} of \hs{id} tells us
   165:   everything about its behaviour, with this style of reasoning known as
   166:   \emph{theorems for free}~\cite{wadler1989theorems}. Haskell definitions are
   167:   commonly made polymorphic like this, to prevent incorrect implementations
   168:   passing the type checker, e.g. \hs{fst :: (Nat, Nat) -> Nat} might return the
   169:   wrong element, but \hs{fst :: (t1, t2) -> t1} can't.
   170: 
   171: \item{Type classes}: Along with their various extensions, type classes are
   172:   interfaces which specify a set of operations over a type or other type-level
   173:   object (like a \emph{type constructor}, e.g. \hs{List}). Many type classes
   174:   also specify a set of \emph{laws} which their operations should obey but,
   175:   lacking a simple mechanism to enforce this, laws are usually considered as
   176:   documentation. As a simple example, we can define a type class \hs{Semigroup}
   177:   with the following operation and an associativity law:
   178: 
   179:   \begin{center}
   180:     \begin{haskell}
   181: op :: forall t. Semigroup t => t -> t -> t
   182:     \end{haskell}
   183: 
   184:     $\forall \text{\hs{x y z}}, \text{\hs{op x (op y z)}} =
   185:                                 \text{\hs{op (op x y) z}}$
   186:   \end{center}
   187: 
   188:   The notation \hs{Semigroup t =>} is a \emph{type class constraint}, which
   189:   restricts the possible types \hs{t} to only those which implement
   190:   \hs{Semigroup}. \footnote{Alternatively, we can consider \hs{Semigroup t} as
   191:     the type of ``implementations of \hs{Semigroup} for \hs{t}'', in which case
   192:   \hs{=>} has a similar role to \hs{->} and we can consider \hs{op} to take
   193:   \emph{four} parameters: a type \hs{t}, an implementation of \hs{Semigroup t}
   194:   and two values of type \hs{t}. As with parameteric polymorphism, this extra
   195:   \hs{Semigroup t} parameter is not available at the value level. Even if it
   196:   were, we could not alter our behaviour by inspecting it, since Haskell only
   197:   allows types to implement each type class in at most one way, so there would
   198:   be no information to branch on.} There are many \emph{instances} of
   199:   \hs{Semigroup} (types which may be substituted for \hs{t}), e.g. \hs{Integer}
   200:   with \hs{op} performing addition. Many more examples can be found in the
   201:   \emph{typeclassopedia} \cite{yorgey2009typeclassopedia}. This ability to
   202:   constrain types, and the existence of laws, helps us reason about code
   203:   generically, rather than repeating the same arguments for each particular pair
   204:   of \hs{t} and \hs{op}.
   205: 
   206: \item{Equational}: Haskell uses equations at the value level, for definitions;
   207:   at the type level, for coercions; at the documentation level, for typeclass
   208:   laws; and at the compiler level, for ad-hoc rewrite rules. This provides us
   209:   with many \emph{sources} of equations, as well as many possible \emph{uses}
   210:   for any equations we might discover. Along with their support in existing
   211:   tools such as SMT solvers, this makes equational conjectures a natural target
   212:   for theory exploration.
   213: 
   214: \item{Modularity}: Haskell's module system allows definitions to be kept
   215:   private. This mechanism allows modules to provide more guarantees than are
   216:   available just in their types, by constraining the ways that values can be
   217:   constructed. For example, the following module represents email addresses as a
   218:   pair of \hs{String}s, one for the user part and one for the host:
   219: 
   220:   \begin{haskell}
   221: -- Exports appear between the parentheses
   222: module Email (Email(), at, render) where
   223: 
   224: data Email = E String String
   225: 
   226: render :: Email -> String
   227: render (E user host) = user ++ "@" ++ host
   228: 
   229: -- if/then/else is sugar for pattern-matching a Bool
   230: at :: String -> String -> Maybe Email
   231: at user host = if user == "" || host == ""
   232:                   then Nothing
   233:                   else Just (E user host)
   234:   \end{haskell}
   235: 
   236:   An \hs{Email} value can be constructed by passing any two \hs{String}s to
   237:   \hs{E}, but \hs{E} is private (not exported). The \hs{at} function is
   238:   exported, but only passes its arguments to \hs{E} iff they are not
   239:   empty.\footnote{\hs{Maybe t} is a safer alternative to the \textsc{Null}
   240:     construct of other languages~\cite{hoare2009null}. It is defined as
   241:     \hs{data Maybe t = Nothing | Just t} and can be understood as an optional
   242:     value, or a computation which may fail, or as a list with at most one
   243:     element, or as a degenerate search tree with no
   244:     backtracking~\cite{wadler1985replace}} Since this module never calls \hs{E}
   245:   with empty \hs{String}s, and other modules must use \hs{at}, we're guaranteed
   246:   that \emph{all} \hs{Email} values will have non-empty \hs{String}s. Such
   247:   ``smart constructors'' can guarantee \emph{any} decidable property, at the
   248:   cost of performing run-time checks on each invocation.\footnote{We can
   249:     guarantee non-emptiness ``by construction'', without run-time checks or
   250:     \hs{Maybe} wrappers, by changing the type to require at least one element.
   251:     \hs{String} is equivalent to \hs{List Char}, with \hs{""} represented as
   252:     \hs{Nil}. Changing \hs{Nil} to require a \hs{Char} would eliminate empty
   253:     \hs{String}s, e.g. \hs{data NonEmpty t = Nil t | Cons t (NonEmpty t)}. We
   254:     could also use a pair like \hs{data NonEmpty t = NE t (List t)} instead.
   255:     Such precise types are often more desirable than smart constructors, but are
   256:     less general since ad-hoc representations need to be invented to enforce
   257:     each guarantee.}
   258: 
   259: \end{description}
   260: 
   261: Together, these features make Haskell code highly structured, amenable to
   262: logical analysis and subject to many algebraic laws. However, as mentioned with
   263: regards to type classes, Haskell itself is incapable of expressing or enforcing
   264: these laws (at least, without difficulty \cite{lindley2014hasochism}). This
   265: reduces the incentive to manually discover, state and prove theorems about
   266: Haskell code, e.g. in the style of interactive theorem proving, as these results
   267: may be invalidated by seemingly innocuous code changes. This puts Haskell in a
   268: rather special position with regards to the discovery of interesting theorems;
   269: namely that many discoveries may be available with very little work, simply
   270: because the code's authors are focused on \emph{software} development rather
   271: than \emph{proof} development. The same cannot be said, for example, of ITP
   272: systems; although our reasoning capabilities may be stronger in an ITP setting,
   273: much of the ``low hanging fruit'' will have already been found through the
   274: user's dedicated efforts, and hence theory exploration would be less likely to
   275: discover unexpected properties.
   276: 
   277: Other empirical advantages to studying Haskell, compared to other programming
   278: languages or theorem proving systems, include:
   279: 
   280: \begin{itemize}
   281: \item The large amount of Haskell code which is freely available online, e.g. in
   282:   repositories like \hackage{}, with which we
   283:   can experiment.
   284: 
   285: \item The existence of theory exploration systems such as \hipspec{},
   286:   \quickspec{} and \speculate{}.
   287: 
   288: \item Related tooling we can re-use such as counterexample finders (\quickcheck{},
   289:   \smallcheck{}, \smartcheck{}, \leancheck{},
   290:   \hedgehog{}, etc.), theorem provers
   291:   (e.g. \hip{}~\cite{rosen2012proving}), and other testing and
   292:   term-generating systems like \mucheck{}~\cite{le2014mucheck},
   293:   \magichaskeller{}~\cite{katayama2011magichaskeller} and
   294:   \djinn{}~\cite{augustsson2005djinn}.
   295: 
   296: \item The remarkable amount of infrastructure which exists for working with
   297:   Haskell code, including package managers, compilers, interpreters, parsers,
   298:   static analysers, etc.
   299: \end{itemize}
   300: 
   301: Further evidence of Haskell's suitability for theory exploration is given by the
   302: fact that the state-of-the-art implementation for Isabelle/HOL, the
   303: \hipster{}~\cite{Hipster} system, is actually implemented by translating
   304: to Haskell and invoking \hipspec{}~\cite{claessen2013automating}.
Generated by git2html.