The Fibonacci numbers for are defined by
if and otherwise.
In Haskell this becomes
fib n | n <= 1 = n | otherwise = fib (n-1) + fib (n-2)
The translation is almost verbatim. Unfortunately, as a show-case for functional programming, the definition of
fib is pretty terrible. The problem can be seen from the evaluation tree for
fib 5 below.
The tree has leaves. More generally, one can show by induction that the tree for
fib n has leaves. Clearly computing by adding up numbers, of which are and are , is not time efficient.
A more minor objection is that the domain of
fib is not . In fact, thanks to Haskell’s polymorphism,
fib is defined on values of any numeric type. For example
fib 1.5 evaluates to
fib 0.5 + fib (-0.5), which, since both arguments are , evaluates to . It seems remarkable that, without any thought (in fact, with the deliberate absence of thought), we have extended the domain of the Fibonacci function to all numbers understood by Haskell. Anyway, this objection is easily addressed:
fib' :: Integer -> Integer fib' n | n < 0 = undefined | n <= 1 = n | otherwise = fib' (n-1) + fib' (n-2)
undefined is the error value, also known as ‘bottom’, that silently inhabits every type.
To address the serious problem, we need to tell Haskell to remember, or memoize the values of
fib, rather than recompute them each time. The following idiom works well and is amenable to generalization.
fibM = \n -> values !! n where values = [fibAux m | m <- [0..]] fibAux n | n <= 1 = n | otherwise = fibM (n-2) + fibM (n-1)
To understand why this works, one has to have some understanding of Haskell’s evaluation model. Haskell is lazy. Roughly put: no value is evaluated unless its necessary for the computation to proceed. Thus if
fibM is evaluated at 30 (say), the apparently infinite list
[fibAux m | m <- [0..]] is only evaluated as far as its 31st member. Moreover, since
values appears at the top level of the ‘where’ in the definition of
fibM, it is shared between all invocations of
fibM. (To use the correct terminology, it is a `constant applicative form’.) Sharing a list means sharing all its members, so each list member, and so each value of
fibM is computed at most once.
This point is maybe made more clearly by considering a simpler definition, that fails to memoize correctly.
fibM' n | n <= 1 = n | otherwise = values !! (n-1) + values !! (n-2) where values = [fibM' m | m <- [0..]]
The problem is that the list
values could potentially depend on (considering replacing
[0..n]). and so cannot be shared between all invocations of
fibM'. The difference with
fibM is that the closure for
values; in contrast, each evaluation of
fibM' n creates a new closure, with a new
values list. (So
fibM' uses exponential storage, as well as exponential time—what a triumph!)
fibM needs some care. We have to introduce an auxillary `part-memoized’ function, and in turn we have to be careful in
fibAux to use memoized values, rather than recursive calls to
fibAux. Maybe we could make all this a one-time effort, by writing a function
memoize :: (a -> a) -> (a -> a), such that
memoize fib evaluates to a memoized version of
A bit of thought suggests this is too ambitious. For example, what if the type
a does not support testing for equality? Then we cannot even know if we are evaluating the memoized function on an argument already seen. But surely, we can write a memoizer for functions with domain and range the natural numbers less than , with type signature
memoizeInt :: (Int -> Int) -> (Int -> Int)?
I don’t understand Haskell or functional programming well enough to be completely confident here, but my belief is that no such memoizer can be defined. Consider the following failed attempt.
memoizeInt :: (Int -> Int) -> (Int -> Int) memoizeInt f = \m -> values !! m where values = [f m | m <- [0..]]
This is a useful memoizer for a function such as
fac n = product [1..n] that may require a lot of work for each evaluation, but do not call themselves recursively. For instance, in ghci 7.8.4 on my laptop, evaluating
fac (10^7) takes 4.5s; after defining
facM = memoizeInt fac, the first evaluation of
facM (10^7) takes 11s, and subsequent evaluations of
facM (10^7) are instantaneous. (The result is in every case 0, because of the limited range of the
The problem in
fibM'' = memoizeInt fib is that when Haskell evaluates
fibM'' n, the recursive evaluations
fib (n-2) and
fib (n-1) call the original
fib function, not the memoized function
fibM''. We would like to somehow ‘reach into’ the definition of
fib so that the recursive calls are redirected, but the immutability of Haskell values makes this impossible.
Recursive functions as fixed points
The difficulty seems to be with functions that call themselves. Ignoring memoization, how do compilers cope such functions? We digress to consider a strategy that sheds some light on the memoization problem. Consider the function
fibBuilder :: (Int -> Int) -> (Int -> Int) fibBuilder f = g where g n | n <= 1 = f n | otherwise = f (n-2) + f (n-1)
The input of
fibBuilder is a function. The output is another function which we shall see is a ‘better approximation’ to a correct Fibonacci function. Consider for example
[id m | m <- [0,1,2,3] [g m | m <- [0,1,2,3]] where g = fibBuilder id [h m | m <- [0,1,2,3]] where h = (fibBuilder (fibBuilder id))
The output for the identity function
id is clearly
[0,1,2,3]. Chasing through the evaluations of
g we get
id 0 + id 1
0 + 1
id 1 + id 2
1 + 2
g 2 + g 1 where g = fibBuilder id evaluates to
1 + 1, and so to
g agrees with
fib on 0, 1, 2 and
h agrees with
fib on 0, 1, 2, 3.
An easy inductive argument shows that a function defined, like
h, but with calls to
fibBuilder agrees with
fib on . We want to somehow ‘pass to the limit’, and get the function
fibInf defined by calling
fibBuilder infinitely many times. So
fibInf should be the same as
fibBuilder fibInf. By analogy with the infinite list
ones = 1 : ones, we boldly define
fibInf = fibBuilder fibInf
The result is disappointing: no evaluation of
fibInf ever terminates. For example,
(fibBuilder fibInf) 3
fibInf 2 + fibInf 1
(fibBuilder fibInf) 2 + fibInf 1
fibInf 1 + fibInf 0 + fibInf 1. Now the evaluation gets stuck, since each summand evaluates to itself. But there is a simple solution: extend the builder function so that it knows the initial conditions for the Fibonacci numbers, as well as the pattern of recursion. (This amounts to deleting one
f in the original definition.)
fibBuilder' :: (Int -> Int) -> (Int -> Int) fibBuilder' f = g where g n | n <= 1 = n | otherwise = f (n-2) + f (n-1) fibInf' = fibBuilder' fibInf'
This works. While non-obvious to define, neither of the functions above calls itself recursively. So, at least in this case, we’ve seen that a compiler that can cope with functions that return other functions can cope with a truly recursive function.
An alternative definition, emphasising that
fibBuilder fibInf are the same is
fibInf' = fix fixBuilder' where fix f = f (fix f)
(Corrected, see comment below.) Thus the defining property of
fibInf' also serves as a working Haskell definition. For more theory on the fixed point combinator see the Wikipedia page. If there was a prize for ‘function that seems the most useless, and is in fact the most useful’,
fix would be a strong contender.
Memoizing builder functions
In the same spirit of ‘defining property as working definition’, we might try to define a memoized version of
fibMInf = fibBuilder' (memoizeInt fibMInf)
Remarkably enough, this works. For example, chasing through the evaluation of
fibMInf 3 we get
fibBuilder' (memoizeInt fibMInf) 3
f 1 + f 2 where f = memoizeInt fibMInf. The closure for
f has the list
values, defined by
values = [f m | m <- [0..]]. So, as if by magic, we now call a memoized function in the recursive evaluations.
So although we haven’t been able to define a good memoizer with type
(Int -> Int) -> (Int -> Int), we can use the naive attempt above to write a good memoizer for builder functions.
memoizeBuilder :: ((Int -> Int) -> (Int -> Int)) -> (Int -> Int) memoizeBuilder builder = fix (builder . memoizeInt) fibMInf' = memoizeBuilder fibBuilder'