Functor, Foldable, and Traversable Over Binary Tree

I spent some time playing with some fundamental abstractions over the past few days. I was curious about Functor, Foldable, and Traversable, in particular. Towards this end, I explored what each of these means as well as an implementation of their corresponding interfaces for a Tree data type. Here’s the type I worked with:

data Tree a
  = Empty
  | Leaf a
  | Node (Tree a) a (Tree a) deriving Show

It’s a binary tree with data stored at leaf and internal nodes.

Functor

Functor gives us the ability to apply a function over an arbitrary structure. The map function from Prelude is a specialization of Functor’s fmap on lists. A few other instances already defined for us include:

Maybe
Either
IO

Here’s what the interface looks like:

class Functor (f :: * -> *) where
  fmap :: (a -> b) -> f a -> f b
  (<$) :: a -> f b -> f a

(<$>) = fmap

What does it mean to apply a function over a Tree? Ideally, it’d affect all nodes of the Tree. This is accomplished as follows:

instance Functor Tree where
    fmap _ Empty = Empty
    fmap f (Leaf a) = Leaf (f a)
    fmap f (Node l x r) = Node (fmap f l) (f x) (fmap f r)

A touch of pattern matching makes this easy enough to accomplish! With the instance implemented, now we can map pure functions over our Trees:

> fmap (+1) $ Node (Leaf 1) 2 (Leaf 3)
Node (Leaf 2) 3 (Leaf 4)

Foldable

Foldable is an abstraction that represents the ability to summarize a structure as a value using a pure, binary function. Prelude exports foldl, foldr, foldl1, foldr1 that are specialized for lists. Let’s take a look at the more general version:

class Foldable (t :: * -> *) where
  fold :: Monoid m => t m -> m
  foldMap :: Monoid m => (a -> m) -> t a -> m
  foldr :: (a -> b -> b) -> b -> t a -> b
  foldr' :: (a -> b -> b) -> b -> t a -> b
  foldl :: (b -> a -> b) -> b -> t a -> b
  foldl' :: (b -> a -> b) -> b -> t a -> b
  foldr1 :: (a -> a -> a) -> t a -> a
  foldl1 :: (a -> a -> a) -> t a -> a

It looks pretty intimidating! Thankfully, if you implement either of foldMap or foldr, the rest of the interface is implemented in terms of those.

A nice feature of Haskell is that if you ever encounter a typeclass with many methods, and you’re not sure what’s the minimum you need to implement to satisfy that interface, you can always do:

instance Foldable Tree where

This will warn:

Warning: No explicit implementation for
  either ‘foldMap’ or ‘foldr’
In the instance declaration for ‘Foldable Tree’

Even though only one of the two is necessary, we’ll implement both foldMap and foldr below for our Tree for the sake of illustration:

import Data.Monoid (mempty, (<>))

instance Foldable Tree where
    foldr _ z Empty = z
    foldr f z (Leaf a) = f z a
    foldr f z (Node l x r) = foldr f (f z (foldr z l)) r

    foldMap _ Empty = mempty
    foldMap f (Leaf a) = f a
    foldMap f (Node l x r) = (foldMap f l) <> (f x) <> (foldMap f r)

foldr is clever. The zero value needed to start the operation is passed in. However, what do we do when we’re folding over subtrees? Remembering that f has type (a -> b -> b), the trick is to use one of the subtrees, along with the internal node data, as part of what we’re passing to one of the folds. This way, we can recurse into both subtrees and merge the values together.

foldMap explicitly requires a Monoid instance on the values it generates. Monoid gives us access to a zero value through mempty and a means to merge values through mappend or (<>). This makes the implementation of foldMap a little easier to read than that of foldr, especially in the Node case.

Here’s folding for our Tree:

> import qualified Data.Foldable as F
> F.foldr (+) 0 $ Node (Leaf 1) 2 (Leaf 3)
6
> F.foldr (*) 1 $ Node (Leaf 2) 3 (Leaf 4)
24

Traversable

Now, we arrive at Traversable. Traversable generalizes several notions related to effectful iteration over a structure. For example, the NICTA course asks one to provide an implementation for a traversal that converts a [Maybe a] -> Maybe [a]. A traversal we’ll look at later is one of the form (Integer -> IO ()) -> Tree Integer -> IO (Tree ()).

Let’s take a look at the interface:

class (Functor t, Foldable t) => Traversable (t :: * -> *) where
  traverse :: Applicative f => (a -> f b) -> t a -> f (t b)
  sequenceA :: Applicative f => t (f a) -> f (t a)
  mapM :: Monad m => (a -> m b) -> t a -> m (t b)
  sequence :: Monad m => t (m a) -> m (t a)

For this one, the required function is one of traverse or sequenceA. This is the case because all Monads are Applicatives, as further explained in the AMP. What’s more curious is that sequenceA can be implemented in terms of traverse and vice-versa.

Take note in the implementation below that we access to the functions from Functor, Foldable, and Applicative. Briefly, I reproduce the Applicative interface below:

class Functor f => Applicative (f :: * -> *) where
  pure :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b
  (*>) :: f a -> f b -> f b
  (<*) :: f a -> f b -> f a

Key to the implementation below is the use of Applicative’s pure and apply (<*>), and Functor’s fmap (<$>). I’ll only cover the implementation of traverse:

instance Traversable Tree where
   traverse _ Empty = pure Empty
   traverse f (Leaf a) = Leaf <$> f a
   traverse f (Node l x r) = Node <$> traverse f l <*> f x <*> traverse f r

Aside from the use of Applicatives, this isn’t too different than our implementation of fmap above for Tree. The flow is similar. The primary difference is that traverse generalizes to effectful functions over fmap.

Here’s Traversable in use:

> import Data.Traversable (traverse)
> traverse print $ Node (Leaf 1) 2 (Leaf 3)
1
2
3

Deriving

What if I told you that we didn’t have to write any of this ourselves? Let me show you a little bit of compiler magic that GHC provides:

{-# LANGUAGE DerivingFunctor #-}
{-# LANGUAGE DerivingFoldable #-}
{-# LANGUAGE DerivingTraversable #-}
data Tree' a
  = Empty'
  | Leaf' a
  | Node' (Tree' a) a (Tree' a)
  deriving (Show, Functor, Foldable, Traversable)

Here’s all our examples from earlier:

> fmap (+1) $ Node' (Leaf' 1) 2 (Leaf' 3)
Node' (Leaf' 2) 3 (Leaf' 4)
> import qualified Data.Foldable as F
> F.foldr (+) 0 $ Node' (Leaf' 1) 2 (Leaf' 3)
6
> F.foldr (*) 1 $ Node' (Leaf' 2) 3 (Leaf' 4)
24
> import Data.Traversable (traverse)
> traverse print $ Node' (Leaf' 1) 2 (Leaf' 3)
1
2
3

How does this work? What are the limitations of deriving Functor, Foldable, and Traversable? Are there any costs?

I’ll save the details for another time. For now, I defer to the GHC manual page on deriving and attribute the ability to derive these constructs (uniquely?) as a consequence of parametricity and category theory.

Closing Notes

It’s terribly fun to work with recursive data in languages that have support for modeling hierarchical structures at the type level. The obvious solution involving pattern matching off of all branches is always a good start towards a correct, complete implementation.