
Effects for Less

Table of Contents

  1. Summary
  2. Thoughts
  3. Benchmarking

A summary of Alexis King’s talk “Effects for Less”, focusing on benchmarks

Summary

  • Real-world benchmarks make it difficult to isolate costs
  • Microbenchmarks are often seen as synthetic
  • With microbenchmarks you need to make sure you’re measuring the right thing
  • They may also lack broad scope.
  • Effect systems make real-world programs hard to benchmark
  • Effect systems tend to consist of many small operations, none of which takes a significant amount of time on its own
  • Splitting a program into modules slows it down
  • Compiler optimisations often fail to fire across module boundaries, which is what causes the cross-module slowdowns
  • Free monad libraries, by reifying programs as trees, obscure the program structure from the optimiser, preventing inlining (see the free monad sketch after this list).
  • For monad transformers, the entire overhead can be eliminated by inlining.
  • GHC does full type erasure
  • GHC doesn’t do monomorphisation (storing the source in the interface file and generating a specialised function for each combination of concrete types)
  • What it actually does is turn typeclass constraints into extra arguments, passing a record corresponding to the typeclass definitions - this is called dictionary-passing style (see the sketch after this list)
  • Effect operations can’t be inlined because they are passed in as arguments via the dictionary.
  • Known calls expose strictness information.
  • Unknown calls to bind are a problem
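
To make the tree-building point concrete, here is a minimal free monad sketch. The Free type, the TeletypeF effect, and all the names are illustrative rather than taken from the talk. The program echo looks like ordinary monadic code, but at runtime it is a data structure that an interpreter walks, so GHC cannot see through it to inline anything:

    {-# LANGUAGE DeriveFunctor #-}

    -- A minimal free monad: programs are reified as trees of instructions.
    data Free f a = Pure a | Free (f (Free f a))

    instance Functor f => Functor (Free f) where
      fmap g (Pure a)  = Pure (g a)
      fmap g (Free fa) = Free (fmap (fmap g) fa)

    instance Functor f => Applicative (Free f) where
      pure = Pure
      Pure g  <*> x = fmap g x
      Free fg <*> x = Free (fmap (<*> x) fg)

    instance Functor f => Monad (Free f) where
      Pure a  >>= k = k a
      Free fa >>= k = Free (fmap (>>= k) fa)

    -- A toy effect signature with two instructions.
    data TeletypeF next
      = GetLine (String -> next)
      | PutLine String next
      deriving Functor

    type Teletype = Free TeletypeF

    readLine :: Teletype String
    readLine = Free (GetLine Pure)

    writeLine :: String -> Teletype ()
    writeLine s = Free (PutLine s (Pure ()))

    -- 'echo' is not straight-line code: it is a tree built by (>>=),
    -- and its structure is opaque to the optimiser.
    echo :: Teletype ()
    echo = readLine >>= writeLine

    -- One possible interpreter, into IO.
    runTeletype :: Teletype a -> IO a
    runTeletype (Pure a)             = pure a
    runTeletype (Free (GetLine k))   = getLine >>= runTeletype . k
    runTeletype (Free (PutLine s k)) = putStrLn s >> runTeletype k

    main :: IO ()
    main = runTeletype echo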
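Similarly, a sketch of dictionary-passing style. The MonadDict record and its field names are made up for illustration; GHC’s real dictionaries differ in detail, but the shape is the same: the constraint becomes an explicit argument, and every use of bind becomes a field lookup followed by a call through a function the optimiser knows nothing about:

    {-# LANGUAGE RankNTypes #-}

    -- A hand-rolled record standing in for the Monad class dictionary.
    data MonadDict m = MonadDict
      { dictReturn :: forall a. a -> m a
      , dictBind   :: forall a b. m a -> (a -> m b) -> m b
      }

    -- Roughly what 'twice :: Monad m => m a -> m a' compiles to:
    -- the constraint is an extra argument, and bind is fetched from
    -- the record at runtime, so the call to it is an unknown call.
    twice :: MonadDict m -> m a -> m a
    twice dict m = dictBind dict m (\_ -> m)

    -- The concrete dictionary for Maybe.
    maybeDict :: MonadDict Maybe
    maybeDict = MonadDict
      { dictReturn = Just
      , dictBind   = \ma k -> maybe Nothing k ma
      }

    main :: IO ()
    main = print (twice maybeDict (Just (1 :: Int)))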

Thoughts

  • The relevance this has to our problem is that the authors of monad-bayes acknowledged that the overhead was in the inference transformer stack, so we need to construct a benchmark that directly measures the cost of operations relating to the inference transformer stack without too much numerical work obscuring the measurement.
  • A microbenchmark that makes use of the inference transformers therefore seems appropriate.
  • We need a benchmark in which inference transformer operations account for an appreciable share of the runtime relative to the numerical code (see the sketch after this list).
    • We need to see whether the cost of some inference transformers is higher than others, so it would be good to have different algorithms that use different inference transformers; this would tell us whether the general cost of effects is the important part or whether the individual transformers are what is slow.
  • We also need to assess the cost of bind; we need this to be fast.
  • We need to construct a benchmark that is robust to program changes, e.g. splitting code across modules, changes in inlining, etc.
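
As a first cut, a microbenchmark along the following lines might work. This is a sketch using criterion, with a plain strict State monad standing in for an inference transformer (the real benchmark would substitute the monad-bayes transformers). The loop body does trivial arithmetic, so the measured time is dominated by bind and the transformer’s plumbing:

    import Control.Monad.State.Strict
    import Criterion.Main

    -- Thread a counter through n binds; the arithmetic is trivial,
    -- so the time measured is mostly (>>=) and dispatch.
    bindChain :: Int -> State Int Int
    bindChain 0 = get
    bindChain n = modify' (+ 1) >> bindChain (n - 1)

    main :: IO ()
    main = defaultMain
      [ bench "State/100k binds" $ nf (\n -> evalState (bindChain n) 0) 100000
      ]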

Benchmarking

  • We are concerned with the performance of the inference transformer system.
  • We therefore want to construct benchmarks that directly measure the cost of the inference transformer system.
  • We do not currently know whether this is due to the inference transformer system as a whole or to the performance of the concrete transformers.
  • It is necessary to construct a suite of benchmarks to determine this.
  • PMMH (particle marginal Metropolis–Hastings) may be a good candidate for whole-system benchmarking as it uses all of the transformers.
  • We will have to think about tests for the individual concrete transformers.
  • We need to work out how to evaluate what a given profiler output says - does this mean we need a hand-rolled, effect-free baseline to see what we get with no effects? (See the sketch after this list.)
  • Re the above: how do we work out what costs the most, i.e. the cost of bind versus inference transformer dispatch?
  • We should work out a plan of action based on the answers to the above questions: if the problem is the concrete monads, we need to come up with faster ones; if the problem is effect dispatch and bind, which solution do we try first? This requires a list of the candidate techniques and their relative approaches and advantages.
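
One way to get the effect-free baseline mentioned above is to write the same computation twice, once directly and once through a transformer, and benchmark the two side by side. Again this is a sketch using criterion with a strict State stand-in rather than the actual monad-bayes transformers; the gap between the two numbers estimates the combined cost of bind and dispatch:

    {-# LANGUAGE BangPatterns #-}
    import Control.Monad.State.Strict
    import Criterion.Main

    -- (a) Hand-rolled baseline: the same loop with no effect machinery.
    sumDirect :: Int -> Int
    sumDirect n = go 0 0
      where
        go !acc i
          | i >= n    = acc
          | otherwise = go (acc + i) (i + 1)

    -- (b) The same loop through a transformer; the difference against
    -- (a) estimates the overhead of bind plus dispatch.
    sumState :: Int -> Int
    sumState n = execState (mapM_ (\i -> modify' (+ i)) [0 .. n - 1]) 0

    main :: IO ()
    main = defaultMain
      [ bgroup "sum/100k"
          [ bench "direct" $ nf sumDirect 100000
          , bench "State"  $ nf sumState  100000
          ]
      ]

Note that with everything in one module GHC may optimise the State version down to the direct loop, which is itself informative: repeating the comparison with the code split across modules (or with a NOINLINE pragma) measures the un-inlined cost, which speaks to the robustness concern above.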
Last updated on 13 Nov 2020
Published on 13 Nov 2020