Haskell Core

• Core Language

The definition of the core language of GHC is given below - all source programs are compiled into this core language. All optimisation passes are in terms of this core language rather than the source programs.

data Expr b
    = Var Id
    | Lit Literal
    | App (Expr b) (Arg b)
    | Lam b (Expr b)
    | Let (Bind b) (Expr b)
    | Case (Expr b) b Type [Alt b]
    | Cast (Expr b) Coercion
    | Tick (Tickish Id) (Expr b)
    | Type Type
    | Coercion Coercion
type Arg b = Expr b
type Alt b = (AltCon, [b], Expr b)
data AltCon
    = DataAlt DataCon
    | LitAlt Literal
    | DEFAULT
data Bind b
    = NonRec b (Expr b)
    | Rec [(b, (Expr b))]

• Basic Optimisations on Core

There are several optimisations performed on a core program which are always beneficial as they correspond to a normal reduction step but performed during compilation rather than at runtime and reduce program size. These kinds of transformations are important to distinguish from the heuristic based optimisations such as inlining and specialisation because we can rely on them happening.

β-reduction: The simplest optimisation is to perform one-step of β-reduction and reduce a lambda expression by substituting the argument into its body.

(λx → e) y ⇝ e [x / y]

Redexes can often appear in generated code if we have a code generator which returns a function type. For example, by composing the generator for the identity function with a quoted unit value, the resulting program will be the identity function applied to the unit, which can be reduced by β-reduction.
```
qid   = ⟦ λx → x ⟧
qunit = ⟦ () ⟧
qres  = $(⟦ ($(qid) $(qunit)) ⟧)
      ⇝ (λx → x) ()
      ⇝ ()
```
Case of Known Constructor: Case expressions can be reduced if the scrutinee evaluates to a known constructor. For example, when casing on a boolean, if the scrutinee is known to be True then the case expression can perform one step of reduction and select the first branch.
```
case True of
  True  → False
  False → True
  ⇝
  False
```
When writing a compositional code generator it is common to parametrise the generator on the value of the scrutinee.
```
genNot :: Code Bool → Code Bool
genNot x = ⟦ case $(x) of
                True  → False
                False → True  ⟧
```
Then if genNot is applied to ⟦ True ⟧ then the generated program will result in a case of known constructor optimisation opportunity.
Case-of-case: If a case appears in the scrutinee position (the thing being examined) then a program is often simplified by reassociating the expression so that the case is pushed into both branches. In the composition not ◦ not, the transformation unveils an opportunity of “case of known constructor” which allows the program to be simplified to the identity function.
```
not (not x)
⇝
case (case x of {True → False; False → True}) of
  True  → False
  False → True
⇝
case x of
  True  → case False of { True → False; False → True }
  False → case True  of { True → False; False → True }
⇝
case x of
  True  → True
  False → False
```
Any program which generates cases will often end up using cases in the scrutinee position so this automatic rewriting can produce a much better final result.
Nested Case: Collapsing multiple case expressions which scrutinise the same value can result in a simpler core program. In the following example, the nested case expression could have been written as a single case expression with three branches.
```
case x of
  'a' → False
  _   → case x of
  'b' → False
  _   → True
⇝
case x of
  'a' → False
  'b' → False
  _   → True
```

Last updated on 13 Nov 2020
Published on 13 Nov 2020