Recursion Schemes
This is a simple and practical introduction to recursion schemes in functional programming.
Recursion schemes are higher-order functions (HOFs) that can be used as building blocks for composing recursive algorithms. Recursion schemes separate the recursive part of a recursive algorithm from the actual calculation, the same way functions like Array.map separate the iteration from the calculation that should be done on each element.
This article will guide you through a thought process for understanding recursion schemes by inventing them yourself.
An Elixir Livebook version of this same introduction is available here.
Recap: Higher Order Functions — for vs map
Before we start, let's revisit a well-known higher-order function: map. Originally rooted in functional programming, nowadays it is commonly used in imperative and object-oriented programming languages as well. It allows us to express an element-wise operation over some container object, like a list or array of elements.
Take a look at the following two example functions. The first one iterates over an array to multiply each element by two, collecting the results in a new array.
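The original listing is not reproduced here; a minimal sketch of such a function could look like this:

```javascript
// Multiply each element by two, collecting the results in a new array.
function doubleAll(array) {
  const result = [];
  for (let i = 0; i < array.length; i++) {
    result.push(array[i] * 2);
  }
  return result;
}
```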
The next one achieves the same result but splits the task of iteration and the task of element-wise calculation into two separate functions. The function doing the iteration and collecting the resulting values is called arrayMap, or typically just map.
map function
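A sketch of this split (the name arrayMap follows the text; the original listing is not shown):

```javascript
// The iteration and collection part, written once and reusable.
function arrayMap(fn, array) {
  const result = [];
  for (let i = 0; i < array.length; i++) {
    result.push(fn(array[i])); // the element-wise calculation is now a callback
  }
  return result;
}
```

Doubling is now just one of many possible callbacks: `arrayMap(x => x * 2, [1, 2, 3])`, `arrayMap(x => x * x, [1, 2, 3])`, and so on, all reuse the same iteration.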
There are many advantages to separating the iteration from the element-wise calculation. The most obvious one is that the iteration only has to be defined once and can be reused for various different operations:
Another advantage is that the intentions and invariants of the task are expressed more clearly and explicitly. Applying the map function will not change the number of elements in the array, and it will not change the order of the elements.
A third advantage of this conceptual separation is that it abstracts away the specific iteration strategy, like the specific use of a for loop (vs. a while loop or some other iteration logic). It allows for later changes and optimizations. For example, the internal iteration order could be reversed or even parallelized without affecting the observable result:
map function
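For instance, a sketch of a variant that iterates from the back while keeping the same interface and the same observable result:

```javascript
// Same interface and same result as arrayMap,
// but internally iterating from the last element to the first.
function arrayMapReversed(fn, array) {
  const result = new Array(array.length);
  for (let i = array.length - 1; i >= 0; i--) {
    result[i] = fn(array[i]); // each result still lands at its original index
  }
  return result;
}
```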
The fourth advantage is that the same concept of applying an element-wise transformation can be implemented for other containers of elements, for example for a binary tree like the one below:
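The tree illustration is not included here; in code, such a tree could be encoded with plain objects, where a leaf is an object with a value property and an inner node holds left and right subtrees (a representation we assume for the following sketches):

```javascript
// Leaf: { value }, inner node: { left, right }
const exampleTree = {
  left: {
    left: { value: 1 },
    right: { value: 2 },
  },
  right: { value: 3 },
};
```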
A corresponding map function could be implemented as follows:
mapTree
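A sketch of such a mapTree, assuming leaves are objects with a value property and inner nodes carry left/right subtrees:

```javascript
// Apply fn to every value in the tree, keeping the tree's shape intact.
function mapTree(fn, node) {
  if ("value" in node) {
    return { value: fn(node.value) }; // leaf: transform the stored value
  }
  return {
    left: mapTree(fn, node.left),   // recurse into the left subtree
    right: mapTree(fn, node.right), // recurse into the right subtree
  };
}
```

For example, `mapTree(x => x * 2, { left: { value: 1 }, right: { value: 2 } })` yields `{ left: { value: 2 }, right: { value: 4 } }`.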
Then we could use the mapTree function in the same way (except for the name) as we did with the arrayMap function before, to transform each element in the tree:
mapTree
Now we have seen that separating the stepping from the work to do at each step is a powerful tool when doing work with each element in a data structure.
Recursive data structures
With the binary tree in the example above we have seen that the corresponding map function allows us to work with a recursive data structure without re-implementing the recursive walking part ourselves each time.
But by the nature of the map function we were constrained to modifying each value of the tree separately. At each step the callback function only has access to one element of the tree, not to its parents or children or other parts of the tree. What if we want to do actual tree-related calculations but still package the walking-the-tree part into a neatly reusable function?
Let's take a look at a few typical cases to see if we can recognize some pattern. Say we want to determine the height of the tree, count the leaves of the tree, count the total number of nodes, and find the minimum and maximum values inside the tree. We also want to serialize our tree into a string, for better readability.
Let's first implement one function for each of our goals by hand. Afterwards we will try to find similarities in the recursion structure and come up with a general, reusable solution.
For the maximum tree height, at each node we calculate the height of both children and take their maximum:
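A sketch, assuming the representation from before (leaf = {value}, inner node = {left, right}) and the convention that a lone leaf has height 1:

```javascript
// Height of the tree: a leaf counts as height 1,
// an inner node is one more than its taller child.
function treeHeight(node) {
  if ("value" in node) return 1;
  return 1 + Math.max(treeHeight(node.left), treeHeight(node.right));
}
```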
For the total number of leaves, the result is 1 if the node itself is a leaf. For inner nodes we sum the number of leaves of the left child and the number of leaves of the right child:
For the total node count we do the same as for counting the leaves, but also add +1 in the inner-node case, to count that node as well.
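Sketches of both counting functions, under the same representation assumptions as before:

```javascript
// Number of leaves: a leaf counts itself; an inner node sums its children.
function countLeaves(node) {
  if ("value" in node) return 1;
  return countLeaves(node.left) + countLeaves(node.right);
}

// Total node count: like countLeaves, but inner nodes add 1 for themselves.
function countNodes(node) {
  if ("value" in node) return 1;
  return 1 + countNodes(node.left) + countNodes(node.right);
}
```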
For the minimum value stored in a tree (our tree stores values only in its leaves), we take the minimum of the left and the right child trees. For leaf nodes we just take their value.
For the maximum value stored in the tree, we do the same as for the minimum value, but take the maximum of the two children:
For turning the tree into a string we concatenate the stringified left child with the stringified right child. For leaf nodes we turn the value stored in the node into a string. We also add some parentheses to reflect the tree structure in the string.
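Sketches of the remaining three functions (the exact parenthesization of the string form is a choice made here):

```javascript
// Minimum value: take a leaf's value directly,
// otherwise the smaller of the two subtree minima.
function treeMin(node) {
  if ("value" in node) return node.value;
  return Math.min(treeMin(node.left), treeMin(node.right));
}

// Maximum value: same shape, with Math.max instead of Math.min.
function treeMax(node) {
  if ("value" in node) return node.value;
  return Math.max(treeMax(node.left), treeMax(node.right));
}

// Serialization: stringify leaves, wrap inner nodes in parentheses.
function treeToString(node) {
  if ("value" in node) return String(node.value);
  return "(" + treeToString(node.left) + " " + treeToString(node.right) + ")";
}
```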
All the implementations above look very similar. All of them check whether the current node is a leaf (i.e. has a value property). If so, a base-case result is returned: either a constant or some other value based only on the value stored in the leaf. If on the other hand the current node is an inner node, two recursive calls are invoked (one for the left and one for the right branch of the tree) and their results are combined in a specific way.
This pattern of similarity between these functions is what we are after. Such a recognizable pattern in recursive algorithms is called a recursion scheme. The specific pattern we just recognized and described is one particular recursion scheme that will later get its own name.
Let's try to extract this pattern of recursive function calls into a common function, similar to our map function. We take one of the function definitions above and try to move all the specific calculations into function parameters to be passed in later:
Instead of hardcoding what to do with the leaf nodes and how exactly the recursive results for the two branches are to be combined, we accept two callback functions: handleLeaf is called with the value stored in each leaf (similar to fn in our mapTree function before) and handleBranch is called with the two results of the recursive calls.
Now we can express all of our different tree-based calculations from above in terms of our new binaryTreeReduce function:
binaryTreeReduce
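A sketch of binaryTreeReduce and the earlier calculations rewritten in terms of it (the names follow the text; the original listings are not shown):

```javascript
// The recursion, written once: recurse into both branches,
// then let the callbacks decide what to do with the results.
function binaryTreeReduce(handleLeaf, handleBranch, node) {
  if ("value" in node) {
    return handleLeaf(node.value); // base case: only the leaf's value matters
  }
  const leftResult = binaryTreeReduce(handleLeaf, handleBranch, node.left);
  const rightResult = binaryTreeReduce(handleLeaf, handleBranch, node.right);
  return handleBranch(leftResult, rightResult);
}

// All the hand-written functions, now one-liners:
const treeHeight = t => binaryTreeReduce(() => 1, (l, r) => 1 + Math.max(l, r), t);
const countLeaves = t => binaryTreeReduce(() => 1, (l, r) => l + r, t);
const countNodes = t => binaryTreeReduce(() => 1, (l, r) => 1 + l + r, t);
const treeMin = t => binaryTreeReduce(v => v, (l, r) => Math.min(l, r), t);
const treeMax = t => binaryTreeReduce(v => v, (l, r) => Math.max(l, r), t);
const treeToString = t => binaryTreeReduce(String, (l, r) => `(${l} ${r})`, t);
```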
That worked great! We have successfully
abstracted the recursion into a single
binaryTreeReduce function. Are we done yet? Have we
mastered the recursion scheme? Almost, but we are not quite
there yet.
Other recursive data structures
The binaryTreeReduce function does work for our binary tree, but what if we have other recursive data structures we want to work with? For example linked lists, trees with more or fewer than exactly two branches, or trees with values stored in the inner nodes as well?
It would be nice if we could come up with a single recursiveReduce function that works the same for all of these structures, the same way the interface of a map(fn, collection) function works for lists, arrays, trees and other collections (more precisely: functors).
Let's write down some of the recursive data structures we want to work with and start from there:
Trees with values stored in both leaves and inner nodes:
Singly-linked lists:
Binary Trees with empty leaf nodes:
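The original definitions are not shown; one possible plain-object encoding of the three structures (these exact representations are assumptions carried through the following sketches):

```javascript
// Tree with values in both leaves and inner nodes:
//   inner node: { value, left, right }, leaf: { value }
const innerValueTree = {
  value: 4,
  left: { value: 2 },
  right: { value: 7, left: { value: 5 }, right: { value: 9 } },
};

// Singly-linked list: a cell is { head, tail }, the end of the list is null.
const linkedList = { head: 3, tail: { head: 1, tail: { head: 2, tail: null } } };

// Binary tree with possibly empty leaf nodes:
//   inner node: { left, right }, leaf: { value }, empty node: { empty: true }
const maybeEmptyTree = {
  left: { value: 5 },
  right: { left: { empty: true }, right: { value: 2 } },
};
```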
Our next goal is again to recognize a pattern shared by all these structures that can then be generalized. Let's try to implement a function that finds the minimum element of each of these three structures. Afterwards we will try to see a common pattern in the three implementations:
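Sketches of the three hand-written minimum functions, under the representations assumed above (Infinity as the fallback value is a choice made here, since it never wins a Math.min comparison):

```javascript
// Tree with values in leaves AND inner nodes: every node carries a value,
// so no fallback is needed.
function innerValueTreeMin(node) {
  if (!("left" in node)) return node.value; // leaf
  return Math.min(
    node.value,
    innerValueTreeMin(node.left),
    innerValueTreeMin(node.right)
  );
}

// Linked list: the empty list has no value, so we need a fallback.
function linkedListMin(list) {
  if (list === null) return Infinity; // nil case at the end of the list
  return Math.min(list.head, linkedListMin(list.tail));
}

// Tree with possibly empty leaves: three cases, and again a fallback
// for the empty case.
function maybeEmptyTreeMin(node) {
  if ("empty" in node) return Infinity;
  if ("value" in node) return node.value;
  return Math.min(maybeEmptyTreeMin(node.left), maybeEmptyTreeMin(node.right));
}
```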
We can observe that these functions differ in the number of branches, depending on the shape of the recursive data structure. For the linked list we need only two branches: one for the nil case at the end of the list and one for the case of the list having a head. For the binary tree with either an inner node with left/right children, an empty node, or a leaf node with a single value, the recursive function needs to handle all three cases.
The number of arguments of the functions varies as well. For the tree that is guaranteed to contain at least one value we do not need a fallback value. But for the other two structures we need a fallback value to return in case the structure itself does not contain any value that could be used as the result.
How could we possibly unify all three of these functions
into a common interface?
Let's take it slowly and first implement a generic function for each of the structures individually, as we did with binaryTreeReduce for the simple binary tree above:
For processing the binary tree we accept two callbacks: one for handling a leaf and one for handling an inner node.
For processing the linked list we also accept two callbacks: one for handling the empty case and one for handling a head. Notice that we provide the fallback for the empty list as a function returning a constant instead of the constant itself. This is for consistency.
For the tree with possibly empty nodes we had to consider three cases and accordingly accept three different callback functions to handle them. As with the linked list, we model the fallback value for the empty case as a callback function as well.
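Sketches of all three generic reduce functions (names and callback names are chosen here to match the surrounding text):

```javascript
// Two callbacks; handleInner receives the node's own value
// plus the two recursive results, so it takes three arguments.
function innerValueTreeReduce(handleLeaf, handleInner, node) {
  if (!("left" in node)) return handleLeaf(node.value);
  return handleInner(
    node.value,
    innerValueTreeReduce(handleLeaf, handleInner, node.left),
    innerValueTreeReduce(handleLeaf, handleInner, node.right)
  );
}

// Two callbacks; handleEmpty takes no arguments and is a function
// returning a constant, for consistency with the other callbacks.
function linkedListReduce(handleEmpty, handleHead, list) {
  if (list === null) return handleEmpty();
  return handleHead(list.head, linkedListReduce(handleEmpty, handleHead, list.tail));
}

// Three cases, three callbacks.
function maybeEmptyTreeReduce(handleEmpty, handleLeaf, handleInner, node) {
  if ("empty" in node) return handleEmpty();
  if ("value" in node) return handleLeaf(node.value);
  return handleInner(
    maybeEmptyTreeReduce(handleEmpty, handleLeaf, handleInner, node.left),
    maybeEmptyTreeReduce(handleEmpty, handleLeaf, handleInner, node.right)
  );
}
```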
Now we can see even more clearly that the number of handle-callbacks needed for each reduction varies, and that even the number of arguments each handle-callback takes varies. For example, handleEmpty takes no arguments while handleInner for one of the trees may take three.
At first glance it does not seem possible to unify the arguments of these different reduction functions into a single generic function that works for lists, trees and other recursive data structures. Now comes the brilliant idea:
But what if we could merge all the callbacks into a single callback function and leave it to this function to handle each possible case correctly? Let's try:
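A sketch of this for the tree with possibly empty nodes (the marker object for the empty case is an assumption made here):

```javascript
// One callback for everything. Each case is passed to handleCases
// as a flat, node-shaped object; the empty case uses a marker object.
function binaryTreeWithEmptyNodesReduce(handleCases, node) {
  if ("empty" in node) {
    return handleCases({ empty: true });       // marker for the empty case
  }
  if ("value" in node) {
    return handleCases({ value: node.value }); // leaf: just the stored value
  }
  return handleCases({
    // left/right carry the RESULTS of the recursive calls, not subtrees
    left: binaryTreeWithEmptyNodesReduce(handleCases, node.left),
    right: binaryTreeWithEmptyNodesReduce(handleCases, node.right),
  });
}

// The minimum, expressed with a single case-dispatching callback:
const treeMin = tree =>
  binaryTreeWithEmptyNodesReduce(cases => {
    if ("empty" in cases) return Infinity;
    if ("value" in cases) return cases.value;
    return Math.min(cases.left, cases.right);
  }, tree);
```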
Did you notice what we did here? We merged all the callback functions into a single handleCases callback that takes only a single value as argument. But what exactly is this argument that we pass to handleCases? It is either an object containing the results of the recursive function calls, or the value stored in a leaf of the data structure in case we hit a base case, or a special marker object in case we are in an empty case.
But look again: the value we pass to
handleCases exactly reflects the structure of the
node we are currently processing.
For example, when processing an inner node of a binary tree with inner values ({value, left, right}), we call handleCases({ value, left, right }) with value containing the value of the current node, but left and right not containing the subtrees of the original tree; instead they contain the already calculated results of the two recursive calls.
What we pass to handleCases looks structurally exactly like a node of the tree but, instead of actually containing the recursive structure of the tree, it is just a flat container for the results of the recursive function calls.
For each of our data structures we can think of having two different but matching storage structures: one that stores the actual tree and one that stores the result of doing a single recursive calculation step at one tree node. Both look like {value, left, right} | {value} on the surface but differ in their content: the one storing the tree is actually {value: Number, left: Tree, right: Tree} | {value: Number}, and the other, storing a calculation result, is {value: Number, left: Number, right: Number} | {value: Number}.
We have essentially come up with a helper structure for passing results around. By mirroring the structure of the recursive tree nodes we can capture all the results of our recursive function calls and pass them to our handler function as a single value. The handler function can then inspect which case of the structure the recursion is currently working on.
This allows us to define generic recursive functions that work for any recursive data structure, as long as for each data structure we carry the corresponding result container around.
But there is more! Take another look at each of the handleCases(...) function calls and prepare for one more unifying abstraction. For example this one:
handleCases() function call
Wow! All the cases in binaryTreeWithEmptyNodesReduce (and in the other ...Reduce functions, respectively) follow exactly the same structure. They just copy the node, replacing the recursive fields, and apply the handleCases function to the now flat copy of the node.
Could we extract this "make a flat copy of the node and replace the recursive parts" step into a separate, general function? Yes we can! Take a look:
binaryTreeWithEmptyNodesReduce into two
functions: the recursion and the node copying
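A sketch of the split, under the same representation assumptions as the earlier examples:

```javascript
// The node copying: make a flat copy of one node, applying `replacer`
// only to the recursive fields (left and right).
function binaryTreeRecursionReplacer(replacer, node) {
  if ("empty" in node) return { empty: true };
  if ("value" in node) return { value: node.value };
  return { left: replacer(node.left), right: replacer(node.right) };
}

// The pure recursion: recurse via the replacer,
// then hand the flattened node to handleCases.
function binaryTreeWithEmptyNodesReduce(handleCases, node) {
  return handleCases(
    binaryTreeRecursionReplacer(
      subtree => binaryTreeWithEmptyNodesReduce(handleCases, subtree),
      node
    )
  );
}
```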
Now look at that! The binaryTreeRecursionReplacer looks almost like our map function from the beginning, but instead of applying the provided function (replacer) to the value stored in a node, the function is applied to the recursive fields (left and right). So we can think of it as an alternative, completely different way of mapping over a tree node.
Also look at the binaryTreeReduce function now: it is really short! The handling of all the cases has been moved into the binaryTreeRecursionReplacer function. What is left in binaryTreeReduce is the pure recursion.
This is our codified recursion scheme!
This specific recursion scheme is called a catamorphism, because the Greek prefix cata- means downwards (as in collapsing), and what all our functions had in common is that they collapsed the trees or lists they were given into a single value. Yes, Array.reduce is also a catamorphism.
Let's put everything back together:
binaryTreeWithEmptyNodesReduce into two
functions: the recursion and the node copying
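Everything together, as a sketch under the representation assumptions used throughout (leaf = {value}, inner node = {left, right}, empty node = {empty: true}):

```javascript
// Mapping over the recursive positions of a single node:
function binaryTreeRecursionReplacer(replacer, node) {
  if ("empty" in node) return { empty: true };
  if ("value" in node) return { value: node.value };
  return { left: replacer(node.left), right: replacer(node.right) };
}

// The recursion scheme (a catamorphism): the only place recursion happens.
function binaryTreeWithEmptyNodesReduce(handleCases, node) {
  return handleCases(
    binaryTreeRecursionReplacer(
      subtree => binaryTreeWithEmptyNodesReduce(handleCases, subtree),
      node
    )
  );
}

// The earlier calculations, now just case-handling callbacks:
const treeMin = t =>
  binaryTreeWithEmptyNodesReduce(c =>
    "empty" in c ? Infinity :
    "value" in c ? c.value :
    Math.min(c.left, c.right), t);

const countLeaves = t =>
  binaryTreeWithEmptyNodesReduce(c =>
    "empty" in c ? 0 :
    "value" in c ? 1 :
    c.left + c.right, t);

const tree = {
  left: { value: 5 },
  right: { left: { empty: true }, right: { value: 2 } },
};
```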
To be continued
We are not even done yet. The next step is to unify the data
structure that stores the tree ({value, left, right}|{value}) and the structure the stores the recursion result into a
single general data structure.
References
TBD