S-expressions
S-expressions are a serialisation format. Their simplicity makes it easy to write parsers, pretty-printers, translators, preprocessors, editor plugins, graphical editors, etc. so you don’t have to inspect the serialised form if you don’t want to. Similar to how there are loads of ways to make a Web site which don’t involve hand-editing HTML, e.g. we can convert from something like markdown/asciidoc/mediawiki/bbcode/etc., we can generate pages programmatically via Haskell/PHP/Python/etc., we can use a WYSIWYG editor, etc. Since s-expressions are much simpler than HTML, using such abstractions is nowhere near as “leaky” (s-expressions just use `(` and `)` rather than arbitrary XML tags, there’s no tag/attribute redundancy, all text is double-quoted, there are no abbreviations to expand (like namespaces), etc.).
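To give a feel for how little machinery is involved, here is a minimal sketch of a parser and printer in Python. It assumes a deliberately small dialect (parentheses, double-quoted strings with backslash escapes, and bare atoms); the names `Atom`, `tokenize`, `parse` and `show` are made up for this illustration and don't come from any particular library.

```python
class Atom(str):
    """A bare symbol such as add-hook; plain str values are quoted strings."""


def tokenize(text):
    """Split text into '(' and ')' tokens plus (kind, value) pairs."""
    tokens, i = [], 0
    while i < len(text):
        c = text[i]
        if c.isspace():
            i += 1
        elif c in "()":
            tokens.append(c)
            i += 1
        elif c == '"':
            # Scan to the closing quote, honouring backslash escapes.
            j, chars = i + 1, []
            while j < len(text) and text[j] != '"':
                if text[j] == "\\" and j + 1 < len(text):
                    j += 1
                chars.append(text[j])
                j += 1
            if j >= len(text):
                raise SyntaxError("unterminated string")
            tokens.append(("str", "".join(chars)))
            i = j + 1
        else:
            # An atom runs until whitespace, a parenthesis or a quote.
            j = i
            while j < len(text) and not text[j].isspace() and text[j] not in '()"':
                j += 1
            tokens.append(("atom", text[i:j]))
            i = j
    return tokens


def parse(text):
    """Parse exactly one s-expression into nested lists, Atoms and strs."""
    tokens, pos = tokenize(text), 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise SyntaxError("missing )")
            pos += 1  # consume the ")"
            return items
        if tok == ")":
            raise SyntaxError("unexpected )")
        kind, value = tok
        return value if kind == "str" else Atom(value)

    result = read()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return result


def show(node):
    """Serialise a tree back to text, quoting and escaping strings."""
    if isinstance(node, list):
        return "(" + " ".join(show(n) for n in node) + ")"
    if isinstance(node, Atom):
        return str(node)
    return '"' + node.replace("\\", "\\\\").replace('"', '\\"') + '"'


text = '(add-hook mode (lambda () "hi"))'
print(show(parse(text)) == text)  # prints True
```

Anything that produces or consumes the nested lists (a translator, a graphical editor, and so on) can sit on top of these two functions without ever looking at the parentheses.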
If you want to, it’s pretty easy to make your own alternative ‘interface’ to such data. There are already loads out there too, e.g.
Whilst the parenthesis-heavy format of s-expressions is not necessary, it usually crops up in anything discussing Lisp and its derivatives, simply because it’s much more popular than these alternatives. To me, that mostly indicates that concerns about “too many parentheses” are really a non-issue, despite being made by many who are new to the format.
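As an illustration of how small such an alternative ‘interface’ can be, here is a sketch in Python which turns an indentation-based outline into s-expression text. The convention is an assumption invented for this example (each line opens a list, and more-deeply-indented lines that follow become its children); it isn’t meant to match any existing notation, and `indent_to_sexp` is a hypothetical name.

```python
def indent_to_sexp(source):
    """Convert an indented outline into s-expression text.

    Assumed convention: every non-blank line opens a list, which stays
    open until a line with the same or smaller indentation appears.
    """
    out, open_indents = [], []
    for line in source.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        indent = len(line) - len(line.lstrip(" "))
        # Close every list that is at least as deeply indented as this line.
        while open_indents and open_indents[-1] >= indent:
            open_indents.pop()
            out.append(")")
        if out:
            out.append(" ")
        out.append("(" + line.strip())
        open_indents.append(indent)
    out.append(")" * len(open_indents))
    return "".join(out)


page = '''
html
  head
    title "Hello"
  body
    p "Hi"
'''
print(indent_to_sexp(page))
# prints (html (head (title "Hello")) (body (p "Hi")))
```

The converter only has to count spaces and emit parentheses; it needs no knowledge of what the symbols mean.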
My Approach
I’m a heavy Emacs user, so I use Emacs to edit everything. My ‘solution’ to the parentheses ‘problem’ is to set their colour to a very low contrast whenever the buffer contains an s-expression language; here’s the relevant Emacs config (written in s-expressions!):
;; Make parentheses dimmer when editing LISP
(defface paren-face
  '((((class color) (background dark))
     (:foreground "grey30"))
    (((class color) (background light))
     (:foreground "grey30")))
  "Face used to dim parentheses.")

;; mapc rather than mapcar, since we only want the side effects
(mapc (lambda (mode)
        (add-hook mode
                  (lambda ()
                    (font-lock-add-keywords nil
                                            '(("(\\|)" . 'paren-face))))))
      '(emacs-lisp-mode-hook scheme-mode-hook racket-mode-hook))
I use show-paren-mode to highlight matching parentheses which are next to the cursor, but otherwise just “tune out” the parentheses in favour of indentation (the way Emacs indents s-expressions is nice enough that I seldom fiddle with it).
For particular s-expression-based languages, like Racket, we can get syntax colouring for symbols, etc. by using the corresponding Emacs mode; this also lets us trigger flycheck syntax checking, and so on.
One of the payoffs of editing a serialised format like s-expressions is that we can use tools like paredit and smartparens to avoid having to care about the textual representation at all: they make navigating and manipulating the syntax tree structure and content relatively nice, they ensure that parentheses and quoted strings always remain balanced, they automatically escape characters when written inside strings, etc.
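The idea can be sketched outside Emacs too. The following Python is a toy model, not how paredit or smartparens are actually implemented: edits are operations on a tree of nested lists, and text is only ever produced by serialising that tree, so it can’t come out unbalanced or unescaped. The operations are loosely modelled on paredit’s wrap, slurp and splice commands; the `Text` wrapper and all function names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Text:
    """A string literal; plain Python strs stand for symbols."""
    value: str


def serialise(node):
    """Render a tree as text: lists become (...), Text is quoted and escaped."""
    if isinstance(node, list):
        return "(" + " ".join(serialise(n) for n in node) + ")"
    if isinstance(node, Text):
        escaped = node.value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return node


def wrap(tree, i):
    """Put a new pair of parentheses around child i."""
    return tree[:i] + [[tree[i]]] + tree[i + 1:]


def slurp(tree, i):
    """Grow the list at child i so it swallows the sibling after it."""
    return tree[:i] + [tree[i] + [tree[i + 1]]] + tree[i + 2:]


def splice(tree, i):
    """Remove the parentheses around child i, keeping its contents."""
    return tree[:i] + tree[i] + tree[i + 1:]


tree = ["message", Text('say "hi"'), "name"]
print(serialise(tree))           # (message "say \"hi\"" name)

wrapped = wrap(tree, 1)
print(serialise(wrapped))        # (message ("say \"hi\"") name)

slurped = slurp(wrapped, 1)
print(serialise(slurped))        # (message ("say \"hi\"" name))

print(splice(slurped, 1) == tree)  # True: splicing undoes wrap + slurp
```

Note that the quotation marks inside the string were typed unescaped; the escaping only happens when the tree is written out.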
Whilst we could, in theory, make similar tree navigators for languages with more complicated textual representations like, say, Haskell, in practice these aren’t as useful since these files are mostly in an unparseable state during editing; for example we might have written a `let` but not yet written an `in`, or a `case` without an `of`, or an `=` without a right-hand-side, etc.
In principle we can solve these in the same way as paredit: insert the whole language construct at once, and let the user fill in the gaps; yet this requires a whole raft of language-specific constructs, whilst paredit can get away with `(`/`)`, `[`/`]`, `{`/`}` and `"`/`"` for basically any language. It also requires custom keybindings to avoid ambiguity: whilst paredit can take over the `(` key to insert a balanced `()` pair, there is no `let` key; the best we could do would be hooking into the spacebar and checking if we’ve just opened a `let`.
In any case, our files would still be unparseable until the user’s filled in all of the gaps: for example `let in` is invalid; `let x in` is invalid; `let x = in` is invalid; `let x = 42 in` is invalid; only when we reach `let x = 42 in y` will the parser not choke. We could put in placeholders like `let _ = _ in _`, but then we’d need to decide whether the user wants to insert a character or overwrite a placeholder; and so on.
Far nicer to expose the tree structure separately, which can be managed without knowing anything about the syntax of our language.
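To make the contrast concrete, here is a small Python simulation of typing the Lisp form `(let ((x 42)) y)` one key at a time. It is a toy model under stated assumptions: “parseable” is approximated by “parentheses and quotes are balanced”, and the structural mode only imitates two paredit behaviours (`(` inserts a balanced pair, and `)` steps over an existing close paren). The function names are hypothetical.

```python
def balanced(text):
    """True if parentheses balance, ignoring any inside double-quoted strings."""
    depth, in_string, escaped = 0, False, False
    for c in text:
        if in_string:
            if escaped:
                escaped = False
            elif c == "\\":
                escaped = True
            elif c == '"':
                in_string = False
        elif c == '"':
            in_string = True
        elif c == "(":
            depth += 1
        elif c == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and not in_string


def type_keys(keys, structural):
    """Replay keystrokes and return the buffer contents after each one.

    The buffer is kept as the text before and after the cursor.
    """
    before, after, history = "", "", []
    for k in keys:
        if structural and k == "(":
            # Insert a balanced pair and leave the cursor between them.
            before += "("
            after = ")" + after
        elif structural and k == ")" and after.startswith(")"):
            # Step over the close paren that is already there.
            before += ")"
            after = after[1:]
        else:
            before += k
        history.append(before + after)
    return history


keys = "(let ((x 42)) y)"
naive = type_keys(keys, structural=False)
structural = type_keys(keys, structural=True)

print(sum(balanced(b) for b in naive), "of", len(naive))            # 1 of 16
print(sum(balanced(b) for b in structural), "of", len(structural))  # 16 of 16
print(structural[-1] == keys)                                       # True
```

With plain typing the buffer only balances after the very last keystroke; with the structural keys it balances after every one, and the editor needed no knowledge of `let` to manage that.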