The Fibonacci numbers for are defined by

if and otherwise.

In Haskell this becomes

fib n | n <= 1 = n | otherwise = fib (n-1) + fib (n-2)

The translation is almost verbatim. Unfortunately, as a show-case for functional programming, the definition of `fib`

is pretty terrible. The problem can be seen from the evaluation tree for `fib 5`

below.

The tree has leaves. More generally, one can show by induction that the tree for `fib n`

has leaves. Clearly computing by adding up numbers, of which are and are , is not time efficient.

A more minor objection is that the domain of `fib`

is not . In fact, thanks to Haskell’s polymorphism, `fib`

is defined on values of *any* numeric type. For example `fib 1.5`

evaluates to `fib 0.5 + fib (-0.5)`

, which, since both arguments are , evaluates to . It seems remarkable that, without any thought (in fact, with the deliberate absence of thought), we have extended the domain of the Fibonacci function to all numbers understood by Haskell. Anyway, this objection is easily addressed:

fib' :: Integer -> Integer fib' n | n < 0 = undefined | n <= 1 = n | otherwise = fib' (n-1) + fib' (n-2)

Here `undefined`

is the error value, also known as ‘bottom’, that silently inhabits every type.

#### Memoization

To address the serious problem, we need to tell Haskell to remember, or *memoize* the values of `fib`

, rather than recompute them each time. The following idiom works well and is amenable to generalization.

fibM = \n -> values !! n where values = [fibAux m | m <- [0..]] fibAux n | n <= 1 = n | otherwise = fibM (n-2) + fibM (n-1)

To understand why this works, one has to have some understanding of Haskell’s evaluation model. Haskell is *lazy*. Roughly put: no value is evaluated unless its necessary for the computation to proceed. Thus if `fibM`

is evaluated at 30 (say), the apparently infinite list `[fibAux m | m <- [0..]]`

is only evaluated as far as its 31st member. Moreover, since `values`

appears at the top level of the ‘where’ in the definition of `fibM`

, it is shared between all invocations of `fibM`

. (To use the correct terminology, it is a `constant applicative form’.) Sharing a list means sharing all its members, so each list member, and so each value of `fibM`

is computed at most once.

This point is maybe made more clearly by considering a simpler definition, that fails to memoize correctly.

fibM' n | n <= 1 = n | otherwise = values !! (n-1) + values !! (n-2) where values = [fibM' m | m <- [0..]]

The problem is that the list `values`

could potentially depend on (considering replacing `[0..]`

with `[0..n]`

). and so cannot be shared between all invocations of `fibM'`

. The difference with `fibM`

is that the closure for `fibM`

includes `values`

; in contrast, each evaluation of `fibM' n`

creates a new closure, with a new `values`

list. (So `fibM'`

uses quadratic storage, as well as exponential time—what a triumph!)

Writing `fibM`

needs some care. We have to introduce an auxillary `part-memoized’ function, and in turn we have to be careful in `fibAux`

to use memoized values, rather than recursive calls to `fibAux`

. Maybe we could make all this a one-time effort, by writing a function `memoize :: (a -> a) -> (a -> a)`

, such that `memoize fib`

evaluates to a memoized version of `fib`

?

A bit of thought suggests this is too ambitious. For example, what if the type `a`

does not support testing for equality? Then we cannot even know if we are evaluating the memoized function on an argument already seen. But surely, we can write a memoizer for functions with domain and range the natural numbers less than , with type signature `memoizeInt :: (Int -> Int) -> (Int -> Int)`

?

I don’t understand Haskell or functional programming well enough to be completely confident here, but my belief is that no such memoizer can be defined. Consider the following failed attempt.

memoizeInt :: (Int -> Int) -> (Int -> Int) memoizeInt f = \m -> values !! m where values = [f m | m <- [0..]]

This is a useful memoizer for a function such as `fac n = product [1..n]`

that may require a lot of work for each evaluation, but do not call themselves recursively. For instance, in ghci 7.8.4 on my laptop, evaluating `fac (10^7)`

takes 4.5s; after defining `facM = memoizeInt fac`

, the first evaluation of `facM (10^7)`

takes 11s, and subsequent evaluations of `facM (10^7)`

are instantaneous. (The result is in every case 0, because of the limited range of the `Int`

type.)

The problem in `fibM'' = memoizeInt fib`

is that when Haskell evaluates `fibM'' n`

, the recursive evaluations `fib (n-2)`

and `fib (n-1)`

call the original `fib`

function, not the memoized function `fibM''`

. We would like to somehow ‘reach into’ the definition of `fib`

so that the recursive calls are redirected, but the immutability of Haskell values makes this impossible.

#### Recursive functions as fixed points

The difficulty seems to be with functions that call themselves. Ignoring memoization, how do compilers cope such functions? We digress to consider a strategy sheds some light on the memoization problem. Consider the function `fibBuilder`

below.

fibBuilder :: (Int -> Int) -> (Int -> Int) fibBuilder f = g where g n | n <= 1 = f n | otherwise = f (n-2) + f (n-1)

The input of `fibBuilder`

is a function. The output is another function which we shall see is a ‘better approximation’ to a correct Fibonacci function. Consider for example

[id m | m <- [0,1,2,3] [g m | m <- [0,1,2,3]] where g = fibBuilder id [h m | m <- [0,1,2,3]] where h = (fibBuilder (fibBuilder id))

The output for the identity function `id`

is clearly `[0,1,2,3]`

. Chasing through the evaluations of `g`

we get `g 2`

`id 0 + id 1 `

`0 + 1`

`1`

and `g 3 `

`id 1 + id 2`

`1 + 2`

`3`

. Hence `h 3`

`g 2 + g 1 where g = fibBuilder id`

evaluates to ` 1 + 1`

, and so to `2`

. Thus `g`

agrees with `fib`

on 0, 1, 2 and `h`

agrees with `fib`

on 0, 1, 2, 3.

An easy inductive argument shows that a function defined, like `g`

and `h`

, but with calls to `fibBuilder`

agrees with `fib`

on . We want to somehow ‘pass to the limit’, and get the function `fibInf`

defined by calling `fibBuilder`

infinitely many times. So `fibInf`

should be the same as `fibBuilder fibInf`

. By analogy with the infinite list `ones = 1 : ones`

, we boldly define

fibInf = fibBuilder fibInf

The result is disappointing: no evaluation of `fibInf`

ever terminates. For example, `fibInf 3`

`(fibBuilder fibInf) 3`

`fibInf 2 + fibInf 1`

`(fibBuilder fibInf) 2 + fibInf 1`

`fibInf 1 + fibInf 0 + fibInf 1`

. Now the evaluation gets stuck, since each summand evaluates to itself. But there is a simple solution: extend the builder function so that it knows the initial conditions for the Fibonacci numbers, as well as the pattern of recursion. (This amounts to deleting one `f`

in the original definition.)

fibBuilder' :: (Int -> Int) -> (Int -> Int) fibBuilder' f = g where g n | n <= 1 = n | otherwise = f (n-2) + f (n-1) fibInf' = fibBuilder' fibInf'

This works. While non-obvious to define, neither of the functions above calls itself recursively. So, at least in this case, we’ve seen that a compiler that can cope with functions that return other functions can cope with a truly recursive function.

An alternative definition, emphasising that `fibInf`

and `fibBuilder fibInf`

are the same is

fibInf' = fix fixBuilder' where fix k = f (fix k)

Thus the defining property of `fibInf'`

also serves as a working Haskell definition. For more theory on the fixed point combinator see the Wikipedia page. If there was a prize for ‘function that seems the most useless, and is in fact the most useful’, `fix`

would be a strong contender.

#### Memoizing builder functions

In the same spirit of ‘defining property as working definition’, we might try to define a memoized version of `fib`

by

fibMInf = fibBuilder' (memoizeInt fibMInf)

Remarkably enough, this works. For example, chasing through the evaluation of `fibMInf 3`

we get `fibMInf 3`

`fibBuilder' (memoizeInt fibMInf) 3`

`f 1 + f 2 where f = memoizeInt fibMInf`

. The closure for `f`

has the list `values`

, defined by `values = [f m | m <- [0..]]`

. So, as if by magic, we now call a memoized function in the recursive evaluations.

So although we haven’t been able to define a good memoizer with type `(Int -> Int) -> (Int -> Int)`

, we *can* use the naive attempt above to write a good memoizer for builder functions.

memoizeBuilder :: ((Int -> Int) -> (Int -> Int)) -> (Int -> Int) memoizeBuilder builder = fix (builder . memoizeInt) fibMInf' = memoizeBuilder fibBuilder'