Haskell Parsing

Posted on by Chris Warburton

I like s-expressions, though there are perfectly good arguments for avoiding them (e.g. “hard for people to read”). Even so, most languages seem to ‘throw the baby out with the bathwater’ by making their human-friendly syntax the only representation!

This has obvious problems for macros (since macros are “new syntax”, they can’t be parsed with the existing grammar). It also makes life harder for tool writers (linters, documentation extractors, coverage analysers, etc.): everyone has to use a full-blown parser, and tools can break due to irrelevant changes in the surface syntax. A task like getting a list of function names shouldn’t break just because, say, a new language version adds a shorthand for pattern-matching.
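To make the “list of function names” point concrete, here is a rough sketch of how little machinery that task needs when a program is available as s-expressions. Everything in it is made up for illustration: the SExpr type, the parseMany and definedNames helpers, and the Scheme-like “define” form are assumptions of mine, not any real language’s format. It also does no error handling (unbalanced parentheses are silently tolerated).

```haskell
import Data.Char (isSpace)

-- A hypothetical, minimal s-expression type: atoms and nested lists.
data SExpr = Atom String | List [SExpr]
  deriving (Eq, Show)

-- Parse as many expressions as possible, returning them along with the
-- unconsumed input. Parsing stops at end of input or at a ')' that closes
-- an enclosing list. No error reporting: this is only a sketch.
parseMany :: String -> ([SExpr], String)
parseMany input =
  case dropWhile isSpace input of
    ""           -> ([], "")
    (')' : rest) -> ([], ')' : rest)
    ('(' : rest) ->
      let (items, afterItems) = parseMany rest
          afterClose          = drop 1 afterItems  -- skip the matching ')'
          (more, leftover)    = parseMany afterClose
      in (List items : more, leftover)
    s ->
      let (name, rest)     = break (\c -> isSpace c || c `elem` "()") s
          (more, leftover) = parseMany rest
      in (Atom name : more, leftover)

-- Collect names bound by top-level forms shaped like
-- (define name ...) or (define (name args...) ...).
-- The "define" keyword is an assumed convention for this example.
definedNames :: [SExpr] -> [String]
definedNames = concatMap nameOf
  where
    nameOf (List (Atom "define" : Atom name : _))            = [name]
    nameOf (List (Atom "define" : List (Atom name : _) : _)) = [name]
    nameOf _                                                 = []
```

For example, definedNames (fst (parseMany "(define (square x) (* x x)) (define pi 3.14) (print (square pi))")) evaluates to ["square","pi"]. Nothing here looks inside the function bodies, so new surface syntax for expressions or patterns would not affect it.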

A few years ago I did a lot of work in Haskell, and the situation there was just a mess. There were three de facto parsers and AST representations: haskell-src-exts was the recommended library, the Template Haskell macro system used its own, and the GHC compiler defined its own as well. This was especially crazy given that:

Thankfully the situation seems to have improved somewhat in the past few years, with GHC’s API becoming saner and hence more reliable. I may have to revisit my old tooling to see if it can take advantage of this!