Staging
Staging and metaprogramming are the same; Template Haskell is a metaprogramming library for Haskell. We have different stages of computation, so staging a program is making it so that it compiles different parts of the program at different stages to be used by other parts of a program at a later stage.
Building blocks of Template Haskell
The core mechanisms of Template Haskell are:
- Evaluating Haskell meta-programs at compile-time and splicing in the generated object programs as regular Haskell code
- Representing Template Haskell object programs as algebraic data types
- The quotation monad
Q
• Writing a meta-program
As an introductory example, consider the curry
function
curry :: ((a,b) -> c) -> a -> b -> c
Unfortunately, there are no Prelude functions that provide the same currying functionality for functions taking arbitrary n-tuples. Manually writing these is tedious - instead we wish to generate functions through a single meta program on demand. Template Haskell lets us do this.
The idea is to write a meta function curryN :: Int -> Q Exp
which, given a number n >= 1, constructs the source code for an n-ary curry function:
{-# LANGUAGE TemplateHaskell #-}
import Control.Monad
import Language.Haskell.TH
curryN :: Int -> Q Exp
curryN n = do
f <- newName "f"
xs <- replicateM n (newName "x")
let args = map VarP (f:xs)
ntup = TupE (map VarE xs)
return $ LamE args (AppE (VarE f) ntup)
- The meta function
curryN
takes an input n and builds a lambda abstractionLamE
that first pattern matches against a functionf
and n argument variablesx1, x2, ..., xn
, and then applies functionf
to the n-tuple(x1, x2, ..., xn)
. - The function
newName
is used to monadically generate the names used to refer to the variablesf
andx1
throughxn
, such that these names are always fresh. Hence the value returned bycurryN
is a monadic computation of typeQ Exp
. - When executed, this monadic computation yields an expression
Exp
representing the object program of an n-arycurry
function.
For example, curryN 3
returns a monadic computation that yields an expression representing a curry3
function of type ((a, b, c) -> d) -> a -> b -> c -> d
in abstract syntax.
• Running a meta-program
To run a meta program like curryN
at compile-time, we enclose it with Template Haskell’s splice operator $
, by writing:
$(curryN 3)
This evaluates the meta program curryN
and puts the resulting object program \f x1 x2 x3 -> f (x1, x2, x3)
in place of the splice.
In general, the splice operator $
can be applied to any monadic Q
computation, resulting in performing this computation at compile-time and inserting the resulting object program as real Haskell code.
Meta programs are type-checked beforehand to yield a valid Template Haskell object program, therefore only imported, fully type-checked meta programs can be run via the splice operator $
. In particular, we have to evaluate the meta program $(curryN 3)
in a separate module to when curryN
is defined.
To generate function declarations for the first n curry functions, we can devise a further meta program on top of curryN
:
genCurries :: Int -> Q [Dec]
genCurries n = forM [1..n] mkCurryDec
where mkCurryDec ith = do
cury <- curryN ith
let name = mkName $ curry" ++ show ith
return $ FunD name [Clause [] (NormalB cury) []]
- In this case, the function
genCurries
will return a list of top-level function declarations that bind the anonymous lambda abstractions built bycurryN
. - Also, the function
mkName
is used instead ofnewName
, because we want to generate functionscurry1
tocurry20
with exactly the prescribed names, so they can be captured and referred to from other parts of the program.
Running $(genCurries 20)
will then splice in the first 20 curry functions at compile-time:
curry1 = \ f x1 -> f (x1)
curry2 = \ f x1 x2 -> f (x1, x2)
curry3 = \ f x1 x2 x3 -> f (x1, x2, x3)
curry4 = \ f x1 x2 x3 x4 -> f (x1, x2, x3, x4)
...
curry20 = \ f x1 x2 ... x20 -> f (x1, x2, ..., x20)
• Object programs as algebraic data types
Object programs created by Template Haskell are represented as regular algebraic data types, describing a program in the form of an abstract syntax tree. The Template Haskell library provides algebraic data types Exp
, Pat
, Dec
, and Type
to represent Haskell’s surface-level syntax of expressions, patterns, declarations, and types.
Virtually all concrete Haskell syntactic constructs have a corresponding abstract syntax constructor in one of these four ADTs. Furthermore, all Haskell identifiers are represented by the Name
data type. By representing object programs as regular ADTs, normal Haskell can be used as the meta-programming language to build object programs.
• Quotation Monad Q
Template Haskell object programs are built inside the quotation monad Q
. This monad is performed by the splice operator $
at compile-time as part of evaluating the meta program.
Generation of code requires certain features to be available to use:
- The ability to generate fresh names that cannot be captured.
- The ability to retrieve information about something by its name.
- The ability to put and get some custom state that is then shared by all Template Haskell code in the same module.
- The ability to run
IO
during compilation so that we can, for example, read something from a file.
The quotation monad Q
hosts all the functions provided by Template Haskell.
The process of Template Haskell:
Template Haskell’s core functionality constitutes evaluating object programs with splice $
and building them from algebraic data types inside the quotation monad Q
. However, constructing object programs in terms of their abstract syntax trees is verbose and leads to clumsy meta programs. Therefore, the Template Haskell API also provides two further interfaces to build object programs more conveniently:
- Syntax construction functions
- Quotation brackets
• Syntax construction functions
For every syntax constructor, there is a corresponding monadic syntax construction function provided. Syntax construction functions directly relate to the syntax constructors from the algebraic data types Exp
, Pat
, Dec
, and Type
for representing Haskell code. However, they also let us hide the monadic nature when building object programs.
To elaborate, recall the definition of the meta function genCurries
:
genCurries :: Int -> Q [Dec]
genCurries n = forM [1..n] mkCurryDec
where mkCurryDec ith = do
cury <- curryN ith
let name = mkName $ curry" ++ show ith
return $ FunD name [Clause [] (NormalB cury) []]
To use the object program generated by the sub call to curryN
in the larger context of the returned function declaration, we have to first perform curryN
and bind its result to cury
. This is because we have to account for curryN
’s generation of fresh names before we can continue.
However, using syntax construction functions instead of data constructors abstracts from the monadic construction of genCurries
, thus making its code a little shorter.
genCurries :: Int -> Q [Dec]
genCurries n = forM [1..n] mkCurryDec
where mkCurryDec ith = funD name [clause [] (normalB (curryN ith)) []]
where name = mkName $ "curry" ++ show ith
The new funD
, clause
, and normalB
functions directly correspond to the formerly used FunD
, Clause
, and NormalB
constructors. The only difference is their types:
FunD :: Name -> [Clause] -> Dec
funD :: Name -> [Q Clause] -> Q Dec
Clause :: [Pat] -> Body -> Clause
clause :: [Q Pat] -> Q Body -> Q Clause
NormalB :: Exp -> Body
normalB :: Q Exp -> Q Body
The syntax constructors (FunD
, Clause
, NormalB
) work with raw Template Haskell expressions.
The syntax constructor functions (funD
, clause
, normalB
) work with the monadic counterparts of raw Template Haskell expressions. They construct a Template Haskell object program directly in Q
- this frees the API consumer from doing the monadic wrapping and unwrapping manually.
• Quotation brackets
Quotation brackets are a method to further simplify the representation of Haskell code. They let us specify an object program using regular Haskell syntax, by enclosing it inside oxford brackets:
[| .. |]
With this, object programs can be specified much more succinctly.
-
For example, using either ADTs or syntax construction functions to express a meta program which builds a Haskell expression for the identity function is quite verbose:
genId :: Q Exp genId = do x <- newName "x" lamE [varP x] (varE x)
Whereas using quotation brackets, writing the same meta program can be simplified to:
genId :: Q Exp genId = [| \x -> x |]
Quotation brackets quote regular Haskell code as the corresponding object program fragments inside the Q
monad. There are quotation brackets for quoting Haskell expressions ([e| .. |]|
), patterns ([p| .. |]
), declarations ([d| .. |]
), and types ([t| .. |]
). Writing [| .. |]
is hereby just another way of saying [e| .. |]
.
Using quotation brackets “lifts” concrete Haskell syntax into corresponding object program expressions inside the Q
monad. By doing so, quotation brackets represent the dual of the already introduced splice operator $
: evaluating a meta program with $
splices in the generated object program as real Haskell code; in contrast, quotation brackets [| .. |]
turn real Haskell code into an object program. Consequently, quotation brackets and the splice operator cancel each other out. The equation $([| e |]) = e
holds for all expressions e
, and similar equations hold for declarations, and types.
• Reification
A major feature of Template Haskell is program reification. Reification allows a meta program to query compile-time information about other program parts whilst constructing the object program. This allows the meta program to inspect other program components to answer questions such as:
- What is this variable’s type?
- What are the class instances of this type class?
- Which constructors does this data type have and what do they look like?
The main use-case of reification is to generate boilerplate code which auto-completes manually written code.
An example of this, is to generically derive type class instances from bare data type definitions. Suppose we define the following polymorphic data types for representing potentially erroneous values, lists, and binary trees:
data Result e a = Err e | Ok a
data List a = Nil | Cons a (List a)
data Tree a = Leaf a | Node (Tree a) a (Tree a)
Then, suppose we want to derive Functor
instances for all of these types automatically. To make a type constructor T
an instance of Functor
, we need to implement the method fmap :: (a -> b) -> T a -> T b
. Its definition is precisely determined by parametricity and the functor laws:
- By parametricity, all values of type
a
must be replaced according to the provided function with values of typeb
. - By the functor laws, all other shapes of the input value of type
T a
must be preserved when transforming it to the output value of typeT b
.
The meta function deriveFunctor :: Name -> Q [Dec]
implements the idea of this algorithm:
data Deriving = Deriving { tyCon :: Name, tyVar :: Name }
deriveFunctor :: Name -> Q [Dec]
deriveFunctor ty
= do (TyConI tyCon) <- reify ty
(tyConName, tyVars, cs) <- case tyCon of
DataD _ nm tyVars cs _ -> return (nm, tyVars, cs)
NewtypeD _ nm tyVars c _ -> return (nm, tyVars, [c])
_ -> fail "deriveFunctor: tyCon may not be a type synonym."
let (KindedTV tyVar StarT) = last tyVars
instanceType = conT ''Functor `appT`
(foldl apply (conT tyConName) (init tyVars))
putQ $ Deriving tyConName tyVar
sequence [instanceD (return []) instanceType [genFmap cs]]
where
apply t (PlainTV name) = appT t (varT name)
apply t (KindedTV name _) = appT t (varT name)