Staging

Staging and metaprogramming are the same; Template Haskell is a metaprogramming library for Haskell. We have different stages of computation, so staging a program is making it so that it compiles different parts of the program at different stages to be used by other parts of a program at a later stage.

Building blocks of Template Haskell

The core mechanisms of Template Haskell are:

Evaluating Haskell meta-programs at compile-time and splicing in the generated object programs as regular Haskell code
Representing Template Haskell object programs as algebraic data types
The quotation monad Q

• Writing a meta-program

As an introductory example, consider the curry function

curry :: ((a,b) -> c) -> a -> b -> c

Unfortunately, there are no Prelude functions that provide the same currying functionality for functions taking arbitrary n-tuples. Manually writing these is tedious - instead we wish to generate functions through a single meta program on demand. Template Haskell lets us do this.

The idea is to write a meta function curryN :: Int -> Q Exp which, given a number n >= 1, constructs the source code for an n-ary curry function:

{-# LANGUAGE TemplateHaskell #-}
import Control.Monad
import Language.Haskell.TH

curryN :: Int -> Q Exp
curryN n = do
  f  <- newName "f"
  xs <- replicateM n (newName "x")
  let args = map VarP (f:xs)
      ntup = TupE (map VarE xs)
  return $ LamE args (AppE (VarE f) ntup)

The meta function curryN takes an input n and builds a lambda abstraction LamE that first pattern matches against a function f and n argument variables x1, x2, ..., xn, and then applies function f to the n-tuple (x1, x2, ..., xn).
The function newName is used to monadically generate the names used to refer to the variables f and x1 through xn, such that these names are always fresh. Hence the value returned by curryN is a monadic computation of type Q Exp.
When executed, this monadic computation yields an expression Exp representing the object program of an n-ary curry function.

For example, curryN 3 returns a monadic computation that yields an expression representing a curry3 function of type ((a, b, c) -> d) -> a -> b -> c -> d in abstract syntax.

• Running a meta-program

To run a meta program like curryN at compile-time, we enclose it with Template Haskell’s splice operator $, by writing:

$(curryN 3)

This evaluates the meta program curryN and puts the resulting object program \f x1 x2 x3 -> f (x1, x2, x3) in place of the splice.

In general, the splice operator $ can be applied to any monadic Q computation, resulting in performing this computation at compile-time and inserting the resulting object program as real Haskell code.

Meta programs are type-checked beforehand to yield a valid Template Haskell object program, therefore only imported, fully type-checked meta programs can be run via the splice operator $. In particular, we have to evaluate the meta program $(curryN 3) in a separate module to when curryN is defined.

To generate function declarations for the first n curry functions, we can devise a further meta program on top of curryN:

genCurries :: Int -> Q [Dec]
genCurries n = forM [1..n] mkCurryDec
   where mkCurryDec ith = do
             cury <- curryN ith
             let name = mkName $ curry" ++ show ith
             return $ FunD name [Clause [] (NormalB cury) []]

In this case, the function genCurries will return a list of top-level function declarations that bind the anonymous lambda abstractions built by curryN.
Also, the function mkName is used instead of newName, because we want to generate functions curry1 to curry20 with exactly the prescribed names, so they can be captured and referred to from other parts of the program.

Running $(genCurries 20) will then splice in the first 20 curry functions at compile-time:

curry1  = \ f x1 -> f (x1)
curry2  = \ f x1 x2 -> f (x1, x2)
curry3  = \ f x1 x2 x3 -> f (x1, x2, x3)
curry4  = \ f x1 x2 x3 x4 -> f (x1, x2, x3, x4)
...
curry20 = \ f x1 x2 ... x20 -> f (x1, x2, ..., x20)

• Object programs as algebraic data types

Object programs created by Template Haskell are represented as regular algebraic data types, describing a program in the form of an abstract syntax tree. The Template Haskell library provides algebraic data types Exp, Pat, Dec, and Type to represent Haskell’s surface-level syntax of expressions, patterns, declarations, and types. Virtually all concrete Haskell syntactic constructs have a corresponding abstract syntax constructor in one of these four ADTs. Furthermore, all Haskell identifiers are represented by the Name data type. By representing object programs as regular ADTs, normal Haskell can be used as the meta-programming language to build object programs.

• Quotation Monad Q

Template Haskell object programs are built inside the quotation monad Q. This monad is performed by the splice operator $ at compile-time as part of evaluating the meta program.

Generation of code requires certain features to be available to use:

The ability to generate fresh names that cannot be captured.
The ability to retrieve information about something by its name.
The ability to put and get some custom state that is then shared by all Template Haskell code in the same module.
The ability to run IO during compilation so that we can, for example, read something from a file.

The quotation monad Q hosts all the functions provided by Template Haskell.

The process of Template Haskell:

Template Haskell’s core functionality constitutes evaluating object programs with splice $ and building them from algebraic data types inside the quotation monad Q. However, constructing object programs in terms of their abstract syntax trees is verbose and leads to clumsy meta programs. Therefore, the Template Haskell API also provides two further interfaces to build object programs more conveniently:

Syntax construction functions
Quotation brackets

• Syntax construction functions

For every syntax constructor, there is a corresponding monadic syntax construction function provided. Syntax construction functions directly relate to the syntax constructors from the algebraic data types Exp, Pat, Dec, and Type for representing Haskell code. However, they also let us hide the monadic nature when building object programs.

To elaborate, recall the definition of the meta function genCurries:

genCurries :: Int -> Q [Dec]
genCurries n = forM [1..n] mkCurryDec
   where mkCurryDec ith = do
             cury <- curryN ith
             let name = mkName $ curry" ++ show ith
             return $ FunD name [Clause [] (NormalB cury) []]

To use the object program generated by the sub call to curryN in the larger context of the returned function declaration, we have to first perform curryN and bind its result to cury. This is because we have to account for curryN’s generation of fresh names before we can continue. However, using syntax construction functions instead of data constructors abstracts from the monadic construction of genCurries, thus making its code a little shorter.

genCurries :: Int -> Q [Dec]
genCurries n = forM [1..n] mkCurryDec
   where mkCurryDec ith = funD name [clause [] (normalB (curryN ith)) []]
            where name = mkName $ "curry" ++ show ith

The new funD, clause, and normalB functions directly correspond to the formerly used FunD, Clause, and NormalB constructors. The only difference is their types:

FunD :: Name -> [Clause]   -> Dec
funD :: Name -> [Q Clause] -> Q Dec

Clause :: [Pat] -> Body -> Clause
clause :: [Q Pat] -> Q Body -> Q Clause

NormalB :: Exp -> Body
normalB :: Q Exp -> Q Body

The syntax constructors (FunD, Clause, NormalB) work with raw Template Haskell expressions. The syntax constructor functions (funD, clause, normalB) work with the monadic counterparts of raw Template Haskell expressions. They construct a Template Haskell object program directly in Q - this frees the API consumer from doing the monadic wrapping and unwrapping manually.

• Quotation brackets

Quotation brackets are a method to further simplify the representation of Haskell code. They let us specify an object program using regular Haskell syntax, by enclosing it inside oxford brackets:

[| .. |]

With this, object programs can be specified much more succinctly.

For example, using either ADTs or syntax construction functions to express a meta program which builds a Haskell expression for the identity function is quite verbose:
```
genId :: Q Exp
genId = do
  x <- newName "x"
  lamE [varP x] (varE x)
```
Whereas using quotation brackets, writing the same meta program can be simplified to:
```
genId :: Q Exp
genId = [| \x -> x |]
```

Quotation brackets quote regular Haskell code as the corresponding object program fragments inside the Q monad. There are quotation brackets for quoting Haskell expressions ([e| .. |]|), patterns ([p| .. |]), declarations ([d| .. |]), and types ([t| .. |]). Writing [| .. |] is hereby just another way of saying [e| .. |].

Using quotation brackets “lifts” concrete Haskell syntax into corresponding object program expressions inside the Q monad. By doing so, quotation brackets represent the dual of the already introduced splice operator $: evaluating a meta program with $ splices in the generated object program as real Haskell code; in contrast, quotation brackets [| .. |] turn real Haskell code into an object program. Consequently, quotation brackets and the splice operator cancel each other out. The equation $([| e |]) = e holds for all expressions e, and similar equations hold for declarations, and types.

• Reification

A major feature of Template Haskell is program reification. Reification allows a meta program to query compile-time information about other program parts whilst constructing the object program. This allows the meta program to inspect other program components to answer questions such as:

What is this variable’s type?
What are the class instances of this type class?
Which constructors does this data type have and what do they look like?

The main use-case of reification is to generate boilerplate code which auto-completes manually written code.

An example of this, is to generically derive type class instances from bare data type definitions. Suppose we define the following polymorphic data types for representing potentially erroneous values, lists, and binary trees:

data Result e a = Err e  | Ok a
data List     a = Nil    | Cons a (List a)
data Tree     a = Leaf a | Node (Tree a) a (Tree a)

Then, suppose we want to derive Functor instances for all of these types automatically. To make a type constructor T an instance of Functor, we need to implement the method fmap :: (a -> b) -> T a -> T b. Its definition is precisely determined by parametricity and the functor laws:

By parametricity, all values of type a must be replaced according to the provided function with values of type b.
By the functor laws, all other shapes of the input value of type T a must be preserved when transforming it to the output value of type T b.

The meta function deriveFunctor :: Name -> Q [Dec] implements the idea of this algorithm:

data Deriving = Deriving { tyCon :: Name, tyVar :: Name }

deriveFunctor :: Name -> Q [Dec]
deriveFunctor ty
  = do (TyConI tyCon) <- reify ty
       (tyConName, tyVars, cs) <- case tyCon of
         DataD _ nm tyVars cs _   -> return (nm, tyVars, cs)
         NewtypeD _ nm tyVars c _ -> return (nm, tyVars, [c])
         _ -> fail "deriveFunctor: tyCon may not be a type synonym."

       let (KindedTV tyVar StarT) = last tyVars
           instanceType           = conT ''Functor `appT`
             (foldl apply (conT tyConName) (init tyVars))

       putQ $ Deriving tyConName tyVar
       sequence [instanceD (return []) instanceType [genFmap cs]]
  where
    apply t (PlainTV name)    = appT t (varT name)
    apply t (KindedTV name _) = appT t (varT name)

Last updated on 13 Nov 2020
Published on 13 Nov 2020