Tail as a Fold

Learning new things is great, but it's also important to go over old things, and see what you've missed. In that spirit, I was recently re-reading “A Tutorial on the Universality and Expressiveness of Fold”(pdf) by Graham Hutton, and came across a particularly interesting snippet

Note that the use of tupling to define primitive recursive functions in terms of fold is precisely the key to defining the predecessor function for the Church numerals (Barendregt, 1984). Indeed, the intuition behind the representation of the natural numbers (or more generally, any inductive datatype) in the λ-calculus is the idea of representing each number by its fold operator.

This statement got me a little excited, because I had been previously intimidated by the predecessor function for Church numerals, and had no idea how I could have derived it for myself. But before deriving this, we first need to know: what are Church Numerals?

Church numerals are a representation of a natural number (0,1,2, and so on) as a function of two arguments f and x, where f is applied to x a number of times equal to that number. The first few natural numbers represented as Church numerals are

{-# LANGUAGE Rank2Types #-}
import Prelude hiding (succ,pred)
newtype Church = Church (forall a. (a -> a) -> a -> a)

zero,one,two,three :: Church
zero  = Church $ \f x -> x
one   = Church $ \f x -> f x
two   = Church $ \f x -> f (f x)
three = Church $ \f x -> f (f (f x))

It's easy to obtain the corresponding natural number from the Church representation, by making f be the successor function for the naturals, and x zero.

churchToNatural :: Church -> Integer
churchToNatural (Church c) = c (1+) 0

If instead we take lists of length n as our representation of natural numbers, we can then see that Church numerals can be obtained by, as Hutton says, thinking of them as the fold over that list.

listToChurch :: [a] -> Church
listToChurch l = Church $ \f n -> foldr (\_ p -> (f p)) n l
zero' = listToChurch []
one'  = listToChurch [1] -- use of integers is arbitrary
two'  = listToChurch [1,1]

So far, none of this was new to me, but the interesting part was the first sentence. Thinking of Church numerals as lists, what Hutton has said is effectively “the use of fold and tupling is the key to defining the predecessor (tail) function on lists”, or more simply “tail can be written as a fold”.

Looking at the definition of tail this statement is quite baffling.

tail :: [a] -> [a]
tail []     = error "can't take tail of empty list"
tail (x:xs) = xs

How does fold enter into the equation there at all?

Well, as outlined by Hutton, we can use the “universal property” of fold as a method for deriving definitions. In this case, we want to solve the equation tail = fold f v for some function f and some value v, or equivalently, we want to solve the two equations

tail []     = v
tail (x:xs) = f x (tail xs)

Immediately, we realise that v = error "can't take tail of empty list", and for the second case we try

tail (x:xs) = f x (tail xs)
xs          = f x (tail xs) -- by definition of tail
xs          = f x y         -- abstracting (tail xs) to y

and we hit a snag: in the function argument to fold we don't have access to xs directly, but only the result tail xs.

The trick then, as mentioned above, is to use tuples, and define a new function

tail' xs = (tail xs, xs)

Trying again, we want to solve tail' = fold f v for some function f and value v, which is equivalent (by universality of fold) to

tail' []     = v
tail' (x:xs) = f x (tail' xs)

A simple calculation for the first case gives us v = (error "empty list", []) and for the second case

        tail' (x:xs)  = f x (tail' xs)
(tail (x:xs), (x:xs)) = f x (tail xs, xs) -- by the definition of tail'
         (xs, (x:xs)) = f x (tail xs, xs) -- by definition of tail
         (xs, (x:xs)) = f x (ys, xs)      -- abstracting (tail xs) to ys
                    f = \x (ys, xs) -> (xs, (x:xs)) -- rearranging

which is a valid definition, and so we can conclude that

tail' = foldr (\x (ys, xs) -> (xs, x:xs)) (error "empty list", [])

and since we no longer use tail in the definition of tail' we are free to redefine tail in terms of tail'.

tail = fst . tail'
-- or equivalently
tail = fst . foldr (\ x (_ , xs) -> (xs, x:xs)) (error "empty list", [])

So, there you have it: tail in terms of foldr. Quite crazy really.

Bringing us back to the Lambda Calculus, we can quite simply define our predecessor function by converting to a list form, taking the tail, then converting back

churchToList :: Church -> [Integer]
-- the use of integers here is completely arbitrary
churchToList (Church c) = c (1:) []

pred' :: Church -> Church
pred' = listToChurch . tail . churchToList

However, the two conversions are unnecessary if we remember our earlier intuition of a Church numeral as a “fold”. Translating our tail, we replace the empty list with zero, and since the value we chose when converting Church numerals to lists was completely arbitrary, we ignore the list value, and convert (x:) to the successor function on Church numerals, succ.

pred :: Church -> Church
pred (Church n) = fst $ n (\(_, xs) -> (xs, succ xs)) (error "empty list", zero)

succ :: Church -> Church
succ (Church n) = Church $ \f x -> f (n f x)

Going further, we can object that the pure Lambda Calculus doesn't have pairs, or an error function, but we can church encode the pairs, and represent errors by a non-terminating expression like Omega, but I leave that to the reader.

If you would like to play with this, the haskell code can be obtained as a gist. Thanks to shachaf and mm_freak on freenode's #haskell channel for helping with a typing issue I had.