writing: 1e5042858af24829cf5aeaedb67f542ae0f558dd
1: \section{Haskell}
2: \label{sec:haskell}
3:
4: \begin{figure}
5: \centering
6: \begin{haskell}
7: -- A datatype with two constructors
8: data Bool = True | False
9:
10: -- A recursive datatype (S requires a Nat as argument)
11: data Nat = Z | S Nat
12:
13: -- A polymorphic (AKA "generic") datatype
14: data List t = Nil | Cons t (List t)
15:
16: -- Arithmetic functions
17:
18: plus :: Nat -> Nat -> Nat
19: plus Z y = y
20: plus (S x) y = S (plus x y)
21:
22: mult :: Nat -> Nat -> Nat
23: mult Z y = Z
24: mult (S x) y = plus y (mult x y)
25:
26: -- Mutually-recursive functions
27:
28: odd :: Nat -> Bool
29: odd Z = False
30: odd (S n) = even n
31:
32: even :: Nat -> Bool
33: even Z = True
34: even (S n) = odd n
35: \end{haskell}
36: \caption{Haskell code, defining datatypes and functions involving them (note
37: that \hs{--} introduces a comment). \hs{Bool} is the Booleans, \hs{Nat} is a
38: Peano encoding of the natural numbers and \hs{List t} are polymorphic lists
39: with elements of type \hs{t}. Juxtaposition denotes a function call, e.g.
40: \hs{f x}, and functions may be defined by pattern-matching (case analysis)
41: on their arguments. \hs{A :: B} is an optional type annotation, stating that
42: value \hs{A} has type \hs{B}. \hs{A -> B} is the type of functions from
43: \hs{A} to \hs{B}.}
44: \label{fig:haskellexample}
45: \end{figure}
46:
47: We mostly focus our attention on the Haskell programming language, both as an
48: implementation vehicle and as our representation of functions, properties, etc.
49: to explore. We choose Haskell since it combines formal, logical underpinnings
50: which aid reasoning (compared to more popular languages like Java or C), yet it
51: is still popular enough to sustain a rich ecosystem of tooling and a large
52: corpus of existing code (compared to more formal languages like Coq or
53: Isabelle). Haskell is well-suited to programming language research; indeed, this
54: was a goal of the language's creators \cite{marlow2010haskell}. An example of
55: Haskell code is given in Figure~\ref{fig:haskellexample}.
56:
57: Like most \emph{functional} programming languages, Haskell builds upon
58: $\lambda$-calculus, with extra features such as a strong type system and
59: ``syntactic sugar'' to improve readability. The following features make Haskell
60: especially useful for our purposes, although many are also present in other
61: languages such as StandardML and Coq (which we also use, but only when needed
62: for compatibility with prior work):
63:
64: \begin{description}
65:
66: \item{Functional}: Most control flow in Haskell (all except
67: \emph{pattern-matching}, described below) is performed by function abstraction
68: and application , which we can reason about using standard rules of inference
69: such as \emph{modus ponens}. For example, reading from external sources (like
70: environment variables) is \emph{impure}, so the only way to parameterise a
71: value of type \hs{B} with a value of type \hs{A} is using a function of type
72: \hs{A -> B}. Conversely, applying a function of type \hs{A -> B} to a value of
73: type \hs{A} can only ever produce a value of type \hs{B} (it may instead crash
74: the program or loop forever, but neither of those are \emph{observable}, i.e.
75: we can't branch on them).
76:
77: \item{Pure}: Execution of actions (e.g. reading or writing files) is separate to
78: evaluation of expressions; hence our reasoning can safely ignore complicated
79: external and non-local interactions. Purity ensures \emph{referential
80: transparency}: references can be substituted for their referent with
81: no change in semantics (as in standard mathematical practice). This implies
82: that calling a function twice with the same argument must produce the same
83: result (modulo crashes or infinite loops). For example, consider applying the
84: function \hs{pair x = (x, x)} to some arbitrary function call \hs{foo y}. The
85: references \hs{x} in the resulting pair \hs{(x, x)} both refer to \hs{foo y}
86: so, by referential transparency, can be substituted to get
87: \hs{(foo y, foo y)}. Both members of \hs{(x, x)} are identical, by definition;
88: hence, to preserve the semantics, both (and indeed \emph{all}) calls to
89: \hs{foo y} must also be identical. This holds regardless of what we choose for
90: \hs{foo} and \hs{y}, and implies that function behaviour cannot depend on
91: external interactions or non-deterministic processes (which may change between
92: calls).
93:
94: \item{Statically Typed}: Expressions are constrained by \emph{types}, which can
95: be used to eliminate unwanted combinations of values (for example
96: \hs{plus True} is not a valid expression), and hence reduce search spaces;
97: \emph{static} types can be deduced syntactically, without having to execute
98: the code. The type of Haskell expressions can also be \emph{inferred}
99: automatically.\footnote{Except for certain cases, such as those caused by the
100: \emph{monomorphism restriction}~\cite{marlow2010haskell}[$\S$4.5.5]}
101:
102: \item{Curried}: All functions in Haskell take a single argument (as in
103: $\lambda$-calculus), which makes them easier to manipulate programatically.
104: Currying allows multi-argument functions to be simulated, by accepting one
105: argument and returning a function to accept the rest. The \hs{mult} function
106: in Figure~\ref{fig:haskellexample} has type \hs{Nat -> Nat -> Nat} meaning it
107: takes a \hs{Nat} as argument and returns a function of type
108: \hs{Nat -> Nat}. Function calls are written with whitespace, so \hs{mult x y}
109: calls the \hs{mult} function with the argument \hs{x}, then calls the result
110: of that with the argument \hs{y}. This allows \emph{partial application} such
111: as \hs{double = mult (S (S Z))}, but for our purposes it is important for
112: unifying calling conventions. For example, in Javascript \hs{mult x y} would
113: be either \js{mult(x, y)} or \js{mult(x)(y)}, and depends on the definition of
114: \js{mult} (the problem compounds with more arguments). In Haskell there is no
115: distinction between these forms.
116:
117: \item{Non-strict}: ``Strict'' evaluation strategies evaluate the arguments of a
118: function call before the function body; non-strict does the opposite. The
119: Haskell standards do not specify a particular evaluation strategy, but they
120: do require that it be non-strict (for efficiency, most implementations use
121: \emph{lazy} evaluation to avoid duplicating work). Strictness can result in
122: infinite loops and other errors which may be avoided by non-strictness,
123: making the latter more useful for reasoning in the face of such errors.
124: For example, given a pair of values \hs{(x, y)} and a projection function
125: \hs{fst}, we might make the ``obvious'' conjecture that
126: \hs{fst (x, y) = x}.\footnote{Incidentally, this is also a valid definition of
127: the \hs{fst} function} This statement is true for non-strict languages, but
128: \emph{not} for strict languages. Crucially, a strict language will attempt to
129: calculate the value of \hs{y}, which may cause an infinite loop or other
130: error; a non-strict language like Haskell will ignore \hs{y} since it isn't
131: used in the body of the \hs{fst} function.
132:
133: \item{Algebraic Data Types}: These provide a rich grammar for building up
134: user-defined data representations, and an inverse mechanism to inspect these
135: data by \emph{pattern-matching} (Haskell's other form of control flow). The
136: \hs{Bool}, \hs{Nat} and \hs{List t} definitions in
137: Figure~\ref{fig:haskellexample} are ADTs; whilst the functions use
138: pattern-matching to branch on their first argument. For our purposes, the
139: useful consequences of ADTs and pattern-matching include their amenability for
140: inductive proofs and the fact they are \emph{closed}; i.e. an ADT's
141: declaration specifies all of the normal forms for that type. This makes
142: exhaustive case analysis trivial, which would be impossible for \emph{open}
143: types (for example, consider classes in an object oriented language, where new
144: subclasses can be introduced at any time).
145:
146: \item{Parametricity}: This allows Haskell \emph{values} to be parameterised over
147: \emph{type-level} objects; provided those objects are never inspected. This
148: enables \emph{polymorphism}. The \hs{List t} type in
149: Figure~\ref{fig:haskellexample} is an example: there are many useful functions
150: involving lists which work in the same way regardless of the element type
151: (e.g. getting the length, appending, reversing, etc.). An even simpler example
152: is the polymorphic identity function \hs{id x = x}. The type of \hs{id} is
153: \hs{forall t. t -> t}\footnote{The \hs{forall t.} is optional; type-level
154: identifiers beginning with a lowercase letter are assumed to be universally
155: quantified variables.}, which we can view as taking \emph{two} parameters:
156: a type \hs{t} and a value of type \hs{t}. Only the latter argument appears in
157: the definition (as \hs{x}), meaning that we can't use the type \hs{t} to
158: determine the function's behaviour. Indeed, in the case of \hs{id} we can't
159: branch on the value of \hs{x} either, since we don't know what type it might
160: have (our definition must work \emph{for all} types \hs{t}); the only
161: functions we can call on \hs{x} must also be polymorphic, and hence also
162: incapable of branching. The type of \hs{id} states that it returns a value of
163: type \hs{t}; without knowing what that type is, the only type-correct value it
164: can return is the argument \hs{x}. Hence the \emph{type} of \hs{id} tells us
165: everything about its behaviour, with this style of reasoning known as
166: \emph{theorems for free}~\cite{wadler1989theorems}. Haskell definitions are
167: commonly made polymorphic like this, to prevent incorrect implementations
168: passing the type checker, e.g. \hs{fst :: (Nat, Nat) -> Nat} might return the
169: wrong element, but \hs{fst :: (t1, t2) -> t1} can't.
170:
171: \item{Type classes}: Along with their various extensions, type classes are
172: interfaces which specify a set of operations over a type or other type-level
173: object (like a \emph{type constructor}, e.g. \hs{List}). Many type classes
174: also specify a set of \emph{laws} which their operations should obey but,
175: lacking a simple mechanism to enforce this, laws are usually considered as
176: documentation. As a simple example, we can define a type class \hs{Semigroup}
177: with the following operation and an associativity law:
178:
179: \begin{center}
180: \begin{haskell}
181: op :: forall t. Semigroup t => t -> t -> t
182: \end{haskell}
183:
184: $\forall \text{\hs{x y z}}, \text{\hs{op x (op y z)}} =
185: \text{\hs{op (op x y) z}}$
186: \end{center}
187:
188: The notation \hs{Semigroup t =>} is a \emph{type class constraint}, which
189: restricts the possible types \hs{t} to only those which implement
190: \hs{Semigroup}. \footnote{Alternatively, we can consider \hs{Semigroup t} as
191: the type of ``implementations of \hs{Semigroup} for \hs{t}'', in which case
192: \hs{=>} has a similar role to \hs{->} and we can consider \hs{op} to take
193: \emph{four} parameters: a type \hs{t}, an implementation of \hs{Semigroup t}
194: and two values of type \hs{t}. As with parameteric polymorphism, this extra
195: \hs{Semigroup t} parameter is not available at the value level. Even if it
196: were, we could not alter our behaviour by inspecting it, since Haskell only
197: allows types to implement each type class in at most one way, so there would
198: be no information to branch on.} There are many \emph{instances} of
199: \hs{Semigroup} (types which may be substituted for \hs{t}), e.g. \hs{Integer}
200: with \hs{op} performing addition. Many more examples can be found in the
201: \emph{typeclassopedia} \cite{yorgey2009typeclassopedia}. This ability to
202: constrain types, and the existence of laws, helps us reason about code
203: generically, rather than repeating the same arguments for each particular pair
204: of \hs{t} and \hs{op}.
205:
206: \item{Equational}: Haskell uses equations at the value level, for definitions;
207: at the type level, for coercions; at the documentation level, for typeclass
208: laws; and at the compiler level, for ad-hoc rewrite rules. This provides us
209: with many \emph{sources} of equations, as well as many possible \emph{uses}
210: for any equations we might discover. Along with their support in existing
211: tools such as SMT solvers, this makes equational conjectures a natural target
212: for theory exploration.
213:
214: \item{Modularity}: Haskell's module system allows definitions to be kept
215: private. This mechanism allows modules to provide more guarantees than are
216: available just in their types, by constraining the ways that values can be
217: constructed. For example, the following module represents email addresses as a
218: pair of \hs{String}s, one for the user part and one for the host:
219:
220: \begin{haskell}
221: -- Exports appear between the parentheses
222: module Email (Email(), at, render) where
223:
224: data Email = E String String
225:
226: render :: Email -> String
227: render (E user host) = user ++ "@" ++ host
228:
229: -- if/then/else is sugar for pattern-matching a Bool
230: at :: String -> String -> Maybe Email
231: at user host = if user == "" || host == ""
232: then Nothing
233: else Just (E user host)
234: \end{haskell}
235:
236: An \hs{Email} value can be constructed by passing any two \hs{String}s to
237: \hs{E}, but \hs{E} is private (not exported). The \hs{at} function is
238: exported, but only passes its arguments to \hs{E} iff they are not
239: empty.\footnote{\hs{Maybe t} is a safer alternative to the \textsc{Null}
240: construct of other languages~\cite{hoare2009null}. It is defined as
241: \hs{data Maybe t = Nothing | Just t} and can be understood as an optional
242: value, or a computation which may fail, or as a list with at most one
243: element, or as a degenerate search tree with no
244: backtracking~\cite{wadler1985replace}} Since this module never calls \hs{E}
245: with empty \hs{String}s, and other modules must use \hs{at}, we're guaranteed
246: that \emph{all} \hs{Email} values will have non-empty \hs{String}s. Such
247: ``smart constructors'' can guarantee \emph{any} decidable property, at the
248: cost of performing run-time checks on each invocation.\footnote{We can
249: guarantee non-emptiness ``by construction'', without run-time checks or
250: \hs{Maybe} wrappers, by changing the type to require at least one element.
251: \hs{String} is equivalent to \hs{List Char}, with \hs{""} represented as
252: \hs{Nil}. Changing \hs{Nil} to require a \hs{Char} would eliminate empty
253: \hs{String}s, e.g. \hs{data NonEmpty t = Nil t | Cons t (NonEmpty t)}. We
254: could also use a pair like \hs{data NonEmpty t = NE t (List t)} instead.
255: Such precise types are often more desirable than smart constructors, but are
256: less general since ad-hoc representations need to be invented to enforce
257: each guarantee.}
258:
259: \end{description}
260:
261: Together, these features make Haskell code highly structured, amenable to
262: logical analysis and subject to many algebraic laws. However, as mentioned with
263: regards to type classes, Haskell itself is incapable of expressing or enforcing
264: these laws (at least, without difficulty \cite{lindley2014hasochism}). This
265: reduces the incentive to manually discover, state and prove theorems about
266: Haskell code, e.g. in the style of interactive theorem proving, as these results
267: may be invalidated by seemingly innocuous code changes. This puts Haskell in a
268: rather special position with regards to the discovery of interesting theorems;
269: namely that many discoveries may be available with very little work, simply
270: because the code's authors are focused on \emph{software} development rather
271: than \emph{proof} development. The same cannot be said, for example, of ITP
272: systems; although our reasoning capabilities may be stronger in an ITP setting,
273: much of the ``low hanging fruit'' will have already been found through the
274: user's dedicated efforts, and hence theory exploration would be less likely to
275: discover unexpected properties.
276:
277: Other empirical advantages to studying Haskell, compared to other programming
278: languages or theorem proving systems, include:
279:
280: \begin{itemize}
281: \item The large amount of Haskell code which is freely available online, e.g. in
282: repositories like \hackage{}, with which we
283: can experiment.
284:
285: \item The existence of theory exploration systems such as \hipspec{},
286: \quickspec{} and \speculate{}.
287:
288: \item Related tooling we can re-use such as counterexample finders (\quickcheck{},
289: \smallcheck{}, \smartcheck{}, \leancheck{},
290: \hedgehog{}, etc.), theorem provers
291: (e.g. \hip{}~\cite{rosen2012proving}), and other testing and
292: term-generating systems like \mucheck{}~\cite{le2014mucheck},
293: \magichaskeller{}~\cite{katayama2011magichaskeller} and
294: \djinn{}~\cite{augustsson2005djinn}.
295:
296: \item The remarkable amount of infrastructure which exists for working with
297: Haskell code, including package managers, compilers, interpreters, parsers,
298: static analysers, etc.
299: \end{itemize}
300:
301: Further evidence of Haskell's suitability for theory exploration is given by the
302: fact that the state-of-the-art implementation for Isabelle/HOL, the
303: \hipster{}~\cite{Hipster} system, is actually implemented by translating
304: to Haskell and invoking \hipspec{}~\cite{claessen2013automating}.
Generated by git2html.