Specialisation

Type class constraints in Haskell are implemented via dictionary passing: a function with a type class constraint is passed a record which provides evidence for the constraint. So even though all instances have been statically resolved before runtime, the dictionaries are still passed to functions at runtime, which can incur significant overhead.

Since type classes are ubiquitous, getting rid of this overhead is critical for any high-performance Haskell program. One of the most important optimisations in the compiler is therefore specialisation: the process of removing type class dictionary arguments by creating a new type-specialised definition for an overloaded function. Once specialised, dictionary methods can easily be inlined, which usually creates more efficient code.

For example, if we define the overloaded function foo:

foo :: Show a => a -> a -> Bool
foo x y = show x == show y

Then the following core definition will be produced:

foo = \ @a $dShow x y ->
    eqString (show $dShow x) (show $dShow y)

There are now 4 parameters to foo:

  • The first argument is a type (denoted by @)
  • The second argument is the dictionary for Show (denoted by the $d prefix)
  • The last two are the arguments x and y of the original definition

The class constraint is translated into a dictionary. Each time a class method is used, we must dynamically look up which definition to use in the supplied class dictionary.
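This dictionary-passing translation can be sketched in plain Haskell. The names ShowDict, dShowBool and fooD below are illustrative, not GHC's actual Core names:

```haskell
-- A hand-rolled sketch of dictionary passing (illustrative names,
-- not GHC's actual Core output).
data ShowDict a = ShowDict { showD :: a -> String }

-- The Show Bool instance becomes a concrete record of methods.
dShowBool :: ShowDict Bool
dShowBool = ShowDict { showD = \b -> if b then "True" else "False" }

-- 'foo' with its constraint turned into an explicit argument: every
-- use of a class method is a field lookup in the dictionary.
fooD :: ShowDict a -> a -> a -> Bool
fooD d x y = showD d x == showD d y
```

Specialising fooD at Bool amounts to substituting dShowBool for d, after which the field lookups can be resolved statically.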

If we know which type a is instantiated with, we can specialise the definition of foo and produce much better code.

qux :: Bool -> Bool -> Bool
qux = foo @Bool

Using foo at a specific type produces a new definition foo_$sfoo which is defined as:

foo_$sfoo :: Bool -> Bool -> Bool
foo_$sfoo = foo @Bool $fShowBool
The Process of Specialisation

First, the body of the program is traversed in order to find functions applied to a statically known dictionary argument.

$dDictShowInt :: Show Int

f :: Show a -> ...

g = ... f @Int $dDictShowInt ...

For each occurrence, a new top-level function is created whose arity is one less than the original function. The definition of this top-level function is the original function applied to just the static dictionary argument.

f_spec = f @Int $dDictShowInt

Finally, a rewrite rule is emitted which rewrites applications of the original function at this type to the new specialised version.

{-# RULES "SPEC f" f @Int $dDictShowInt = f_spec #-}
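GHC also lets the programmer request this transformation explicitly: a SPECIALISE pragma attached to an overloaded definition generates the specialised copy and the rewrite rule described above. A small sketch:

```haskell
foo :: Show a => a -> a -> Bool
foo x y = show x == show y
-- Ask GHC to generate a Bool-specialised copy of foo, together with a
-- rewrite rule directing calls at Bool to it.
{-# SPECIALISE foo :: Bool -> Bool -> Bool #-}
```

This is particularly useful when the overloaded function and its call sites live in different modules, where the specialiser would not otherwise see the statically known dictionary.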
Relationship with Inlining

Specialisation is very closely related to inlining. Another way to implement specialisation would be to directly inline the overloaded function at each call site where it is applied to a statically known dictionary.

The advantages of specialisation are:

  1. It usually improves the program, because the specialised function is applied to a statically known argument. Inlining, in general, is driven by heuristics which may fail to exploit the static information. (The inliner's argument discount can be seen as a generalised version of the specialisation check.)

  2. Code bloat is avoided by processing a whole module at once. Specialisations are shared throughout the whole program, so each function is specialised at each type only once. This contrasts with inlining, where the decision is made locally at each call site and work is not shared between decisions.

  3. Recursive functions are specialised but never inlined, so the only way to eliminate a statically known argument from a recursive function is if that argument is a type class dictionary. This is where specialisation matters most: for non-recursive functions inlining can play a similar role, but for an unspecialised recursive function the overhead of passing the dictionary is paid on each iteration.
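To make the last point concrete, here is a hypothetical recursive overloaded function: without specialisation the Show dictionary is passed again on every recursive call, but using it at a known type lets GHC create a dictionary-free copy.

```haskell
-- A recursive overloaded function: the Show dictionary is an argument
-- that is passed anew on every iteration of the loop.
showAll :: Show a => [a] -> [String]
showAll []       = []
showAll (x : xs) = show x : showAll xs

-- Using showAll at Int gives the specialiser a statically known
-- dictionary to eliminate from the recursive calls.
showInts :: [Int] -> [String]
showInts = showAll
```
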

Trick : Inlining recursive definitions with static arguments

Recall that we stated that GHC will never inline a recursive function. However, there is a technique which uses specialisation to force GHC to unroll a finite amount of recursion.

Consider the problem of applying a function f k times to an argument – if k is known statically, then the loop can be unrolled to k applications of f.

nTimes 0 (+1) 0 ≡ 0
nTimes 4 (+1) 0 ≡ (+1) ((+1) ((+1) ((+1) 0)))

Writing the function naively is straightforward, but this definition will not be unrolled for a statically known k: the function is self-recursive, so it will not be inlined, and its Int argument is not a dictionary, so it will not be specialised.

nTimes :: Int -> (a -> a) -> a -> a
nTimes 0 f x = x
nTimes k f x = f (nTimes (k - 1) f x)

We can use specialisation to define a version of nTimes which is unrolled. For specialisation to happen, the argument to nTimes must be a type class dictionary. This means defining nTimes using a type class (Unroll) which is parameterised by a type-level representation of the natural numbers. An instance is defined for the base case and the inductive case, corresponding to the two cases of nTimes given above.

{-# LANGUAGE DataKinds, KindSignatures, PolyKinds, ScopedTypeVariables #-}

data Proxy (t :: k) = Proxy

data N = Z | S N

class Unroll (n :: N) where
  nTimes :: Proxy n -> (a -> a) -> a -> a

instance Unroll Z where
  nTimes _ f x = x

instance Unroll n => Unroll (S n) where
  -- ScopedTypeVariables brings the n from the instance head into scope.
  nTimes _ f x = f (nTimes (Proxy :: Proxy n) f x)

Now the argument to nTimes is essentially a type of kind N passed using the Proxy constructor.

oneTime :: (a -> a) -> a -> a
oneTime = nTimes (Proxy :: Proxy (S Z))

fiveTimes :: (a -> a) -> a -> a
fiveTimes = nTimes (Proxy :: Proxy (S (S (S (S (S Z))))))

Now specialisation can work on nTimes because the argument is a dictionary. In oneTime, a specialisation of nTimes to the S Z dictionary is created first, followed by a specialisation to the Z dictionary. The eventual result is an unrolled pipeline of functions. GHC is happy to do this repeated specialisation because the constraint language ensures that the type class evidence is finite, so specialisation cannot go on forever.
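To check that the unrolled pipeline behaves like the plain loop, here is a self-contained copy of the definitions above (with Proxy restricted to kind N for simplicity), together with the extensions they need:

```haskell
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables #-}

-- Type-level natural numbers and a Proxy to pass them at the value level.
data N = Z | S N

data Proxy (t :: N) = Proxy

class Unroll (n :: N) where
  nTimes :: Proxy n -> (a -> a) -> a -> a

instance Unroll Z where
  nTimes _ f x = x

instance Unroll n => Unroll (S n) where
  -- ScopedTypeVariables brings the n from the instance head into scope.
  nTimes _ f x = f (nTimes (Proxy :: Proxy n) f x)

fiveTimes :: (a -> a) -> a -> a
fiveTimes = nTimes (Proxy :: Proxy (S (S (S (S (S Z))))))
```

For example, fiveTimes (+1) 0 applies (+1) five times, giving 5.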

Last updated on 13 Nov 2020
Published on 13 Nov 2020