Recursion Schemes
This is a simple and practical introduction to recursion schemes in functional programming.
Recursion schemes are higher-order functions (HOFs) that can be used as building blocks for composing recursive algorithms. Recursion schemes separate the recursive part of a recursive algorithm from the actual calculation, the same way functions like Array.map separate the iteration from the calculation that should be done on each element.
This article will guide you through a thought process for understanding recursion schemes by inventing them yourself.
An Elixir Livebook version of this same introduction is available here.
Recap: Higher Order Functions — for vs map
Before we start, let's revisit a well-known higher-order function: map. Originally rooted in functional programming, nowadays it is commonly used in imperative and object-oriented programming languages as well. It allows us to express an element-wise operation over some container object, like a list or array of elements.
Take a look at the following two example functions. The first one iterates over an array to multiply each element by two, collecting the results in a new array.
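The original listing is not reproduced here; a minimal sketch of such a function could look like this:

```javascript
// Multiply each element by two, collecting the results in a new array.
function doubleAll(array) {
  const result = [];
  for (let i = 0; i < array.length; i++) {
    result.push(array[i] * 2);
  }
  return result;
}
```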
The next one achieves the same result but splits the task of iteration and the task of element-wise calculation into two separate functions. The function doing the iteration and collecting the resulting values is called arrayMap, or typically just map.
map function
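A sketch of this split (the name arrayMap follows the text; the original listing is not shown):

```javascript
// The iteration and collection part, written once and reusable.
function arrayMap(fn, array) {
  const result = [];
  for (let i = 0; i < array.length; i++) {
    result.push(fn(array[i])); // the element-wise calculation is now a callback
  }
  return result;
}
```

Doubling is now just one of many possible callbacks: `arrayMap(x => x * 2, [1, 2, 3])`, `arrayMap(x => x * x, [1, 2, 3])`, and so on, all reuse the same iteration.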
There are many advantages to separating the iteration from the element-wise calculation. The most obvious one is that the iteration only has to be defined once and can be reused for various different operations:
Another advantage is that the intentions and invariants of the task are expressed more clearly and explicitly. Applying the map function will not change the number of elements in the array, and it will not change the order of the elements.
A third advantage of this conceptual separation is that it abstracts away the specific iteration strategy, like the specific use of a for loop (vs. a while loop or some other iteration logic). It allows for later changes and optimizations. For example, the internal iteration order could be reversed or even parallelized without affecting the observable result:
map function
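For instance, a sketch of a variant that iterates from the back while keeping the same interface and the same observable result:

```javascript
// Same interface and same result as arrayMap,
// but internally iterating from the last element to the first.
function arrayMapReversed(fn, array) {
  const result = new Array(array.length);
  for (let i = array.length - 1; i >= 0; i--) {
    result[i] = fn(array[i]); // each result still lands at its original index
  }
  return result;
}
```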
The fourth advantage is that the same concept of applying an element-wise transformation can be implemented for other containers of elements, for example for a binary tree like the one below:
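The tree illustration is not included here; in code, such a tree could be encoded with plain objects, where a leaf is an object with a value property and an inner node holds left and right subtrees (a representation we assume for the following sketches):

```javascript
// Leaf: { value }, inner node: { left, right }
const exampleTree = {
  left: {
    left: { value: 1 },
    right: { value: 2 },
  },
  right: { value: 3 },
};
```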
A corresponding map function could be implemented as follows:
mapTree
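A sketch of such a mapTree, assuming leaves are objects with a value property and inner nodes carry left/right subtrees:

```javascript
// Apply fn to every value in the tree, keeping the tree's shape intact.
function mapTree(fn, node) {
  if ("value" in node) {
    return { value: fn(node.value) }; // leaf: transform the stored value
  }
  return {
    left: mapTree(fn, node.left),   // recurse into the left subtree
    right: mapTree(fn, node.right), // recurse into the right subtree
  };
}
```

For example, `mapTree(x => x * 2, { left: { value: 1 }, right: { value: 2 } })` yields `{ left: { value: 2 }, right: { value: 4 } }`.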
Then we could use the mapTree function in the same way (except for the name) as we did with the arrayMap function before, to transform each element in the tree:
mapTree
Now we have seen that separating the stepping from the work to do at each step is a powerful tool when doing work with each element in a data structure.
Recursive data structures
With the binary tree in the example above we have seen that the corresponding map function allows us to work with a recursive data structure without re-implementing the recursive walking part ourselves each time.
But by the nature of the map function we were constrained to modifying each value of the tree separately. At each step the callback function only has access to one element of the tree, not to its parents or children or other parts of the tree. What if we want to do actual tree-related calculations but still package the walking-the-tree part into a neatly reusable function?
Let's take a look at a few typical cases to see if we can recognize some pattern. Say we want to determine the height of the tree, count the leaves of the tree, count the total number of nodes, and find the minimum and maximum values inside the tree. We also want to serialize our tree into a string, for better readability.
Let's first implement one function for each of our goals by hand. Afterwards we will try to find similarities in the recursion structure and come up with a general, reusable solution.
For the maximum tree height, at each node we calculate the height of both children and take their maximum:
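A sketch, assuming the representation from before (leaf = {value}, inner node = {left, right}) and the convention that a lone leaf has height 1:

```javascript
// Height of the tree: a leaf counts as height 1,
// an inner node is one more than its taller child.
function treeHeight(node) {
  if ("value" in node) return 1;
  return 1 + Math.max(treeHeight(node.left), treeHeight(node.right));
}
```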
For the total number of leaves, the result is 1 if the node itself is a leaf. For inner nodes we sum the number of leaves of the left child and the number of leaves of the right child:
For the total node count we do the same as for counting the leaves, but also add +1 in the inner-node case, to count that node as well.
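Sketches of both counting functions, under the same representation assumptions as before:

```javascript
// Number of leaves: a leaf counts itself; an inner node sums its children.
function countLeaves(node) {
  if ("value" in node) return 1;
  return countLeaves(node.left) + countLeaves(node.right);
}

// Total node count: like countLeaves, but inner nodes add 1 for themselves.
function countNodes(node) {
  if ("value" in node) return 1;
  return 1 + countNodes(node.left) + countNodes(node.right);
}
```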
For the minimum value stored in a tree (our tree stores values only in its leaves), we take the minimum of the left and the right child trees. For leaf nodes we just take their value.
For the maximum value stored in the tree, we do the same as for the minimum value, but take the maximum of the two children:
For turning the tree into a string we concatenate the stringified left child with the stringified right child. For leaf nodes we turn the value stored in the node into a string. We also add some parentheses to reflect the tree structure in the string.
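Sketches of the remaining three functions (the exact parenthesization of the string form is a choice made here):

```javascript
// Minimum value: take a leaf's value directly,
// otherwise the smaller of the two subtree minima.
function treeMin(node) {
  if ("value" in node) return node.value;
  return Math.min(treeMin(node.left), treeMin(node.right));
}

// Maximum value: same shape, with Math.max instead of Math.min.
function treeMax(node) {
  if ("value" in node) return node.value;
  return Math.max(treeMax(node.left), treeMax(node.right));
}

// Serialization: stringify leaves, wrap inner nodes in parentheses.
function treeToString(node) {
  if ("value" in node) return String(node.value);
  return "(" + treeToString(node.left) + " " + treeToString(node.right) + ")";
}
```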
All the implementations above look very similar. All of them check whether the current node is a leaf (i.e. has a value property). If so, a base-case result is returned: either a constant or some other value based only on the value stored in the leaf. If on the other hand the current node is an inner node, two recursive calls are invoked (one for the left and one for the right branch of the tree) and their results are combined in a specific way.
This pattern of similarity between these functions is what we are after. Such a recognizable pattern in recursive algorithms is called a recursion scheme. The specific pattern we just recognized and described is one particular recursion scheme that will later get its own name.
Let's try to extract this pattern of recursive function calls into a common function, similar to our map function. We take one of the function definitions above and try to move all the specific calculations into function parameters to be passed in later:
Instead of hardcoding what to do with the leaf nodes and how exactly the recursive results for the two branches are to be combined, we accept two callback functions: handleLeaf is called with the value stored in each leaf (similar to fn in our mapTree function before) and handleBranch is called with the two results of the recursive calls.
Now we can express all of our different tree-based calculations from above in terms of our new binaryTreeReduce function:
binaryTreeReduce
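A sketch of binaryTreeReduce and the earlier calculations rewritten in terms of it (the names follow the text; the original listings are not shown):

```javascript
// The recursion, written once: recurse into both branches,
// then let the callbacks decide what to do with the results.
function binaryTreeReduce(handleLeaf, handleBranch, node) {
  if ("value" in node) {
    return handleLeaf(node.value); // base case: only the leaf's value matters
  }
  const leftResult = binaryTreeReduce(handleLeaf, handleBranch, node.left);
  const rightResult = binaryTreeReduce(handleLeaf, handleBranch, node.right);
  return handleBranch(leftResult, rightResult);
}

// All the hand-written functions, now one-liners:
const treeHeight = t => binaryTreeReduce(() => 1, (l, r) => 1 + Math.max(l, r), t);
const countLeaves = t => binaryTreeReduce(() => 1, (l, r) => l + r, t);
const countNodes = t => binaryTreeReduce(() => 1, (l, r) => 1 + l + r, t);
const treeMin = t => binaryTreeReduce(v => v, (l, r) => Math.min(l, r), t);
const treeMax = t => binaryTreeReduce(v => v, (l, r) => Math.max(l, r), t);
const treeToString = t => binaryTreeReduce(String, (l, r) => `(${l} ${r})`, t);
```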
That worked great! We have successfully
abstracted the recursion into a single
binaryTreeReduce function. Are we done yet? Have we
mastered the recursion scheme? Almost, but we are not quite
there yet.
Other recursive data structures
The binaryTreeReduce function does work for our binary tree, but what if we have other recursive data structures we want to work with? For example linked lists, trees with more or fewer than exactly two branches, or trees with values stored in the inner nodes as well?
It would be nice if we could come up with a single recursiveReduce function that works the same for all of these structures, the same way the interface of a map(fn, collection) function works for lists, arrays, trees and other collections (more precisely: functors).
Let's write down some of the recursive data structures we want to work with and start from there:
Trees with values stored in both leaves and inner nodes:
Singly-linked lists:
Binary Trees with empty leaf nodes:
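The original definitions are not shown; one possible plain-object encoding of the three structures (these exact representations are assumptions carried through the following sketches):

```javascript
// Tree with values in both leaves and inner nodes:
//   inner node: { value, left, right }, leaf: { value }
const innerValueTree = {
  value: 4,
  left: { value: 2 },
  right: { value: 7, left: { value: 5 }, right: { value: 9 } },
};

// Singly-linked list: a cell is { head, tail }, the end of the list is null.
const linkedList = { head: 3, tail: { head: 1, tail: { head: 2, tail: null } } };

// Binary tree with possibly empty leaf nodes:
//   inner node: { left, right }, leaf: { value }, empty node: { empty: true }
const maybeEmptyTree = {
  left: { value: 5 },
  right: { left: { empty: true }, right: { value: 2 } },
};
```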
Our next goal is again to recognize a pattern shared by all these structures that can then be generalized. Let's try to implement a function that finds the minimum element of each of these three structures. Afterwards we will try to see a common pattern in the three implementations:
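Sketches of the three hand-written minimum functions, under the representations assumed above (Infinity as the fallback value is a choice made here, since it never wins a Math.min comparison):

```javascript
// Tree with values in leaves AND inner nodes: every node carries a value,
// so no fallback is needed.
function innerValueTreeMin(node) {
  if (!("left" in node)) return node.value; // leaf
  return Math.min(
    node.value,
    innerValueTreeMin(node.left),
    innerValueTreeMin(node.right)
  );
}

// Linked list: the empty list has no value, so we need a fallback.
function linkedListMin(list) {
  if (list === null) return Infinity; // nil case at the end of the list
  return Math.min(list.head, linkedListMin(list.tail));
}

// Tree with possibly empty leaves: three cases, and again a fallback
// for the empty case.
function maybeEmptyTreeMin(node) {
  if ("empty" in node) return Infinity;
  if ("value" in node) return node.value;
  return Math.min(maybeEmptyTreeMin(node.left), maybeEmptyTreeMin(node.right));
}
```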
We can observe that these functions differ in the number of branches, depending on the shape of the recursive data structure. For the linked list we need only two branches: one for the nil case at the end of the list and one for the case of the list having a head. For the binary tree with either an inner node with left/right children, an empty node, or a leaf node with a single value, the recursive function needs to handle all three cases.
The number of arguments of the functions varies as well. For the tree that is guaranteed to contain at least one value we do not need a fallback value. But for the other two structures we need a fallback value to return in case the structure itself does not contain any value that could be used as the result.
How could we possibly unify all three of these functions
into a common interface?
Let's take it slowly and first implement a generic function for each of the structures individually, as we did with binaryTreeReduce for the simple binary tree above:
For processing the binary tree we accept two callbacks: one for handling a leaf and one for handling an inner node.
For processing the linked list we also accept two callbacks: one for handling the empty case and one for handling a head. Notice that we provide the fallback for the empty list as a function returning a constant instead of the constant itself. This is for consistency.
For the tree with possibly empty nodes we had to consider three cases and accordingly accept three different callback functions to handle them. As with the linked list, we model the fallback value for the empty case as a callback function as well.
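Sketches of all three generic reduce functions (names and callback names are chosen here to match the surrounding text):

```javascript
// Two callbacks; handleInner receives the node's own value
// plus the two recursive results, so it takes three arguments.
function innerValueTreeReduce(handleLeaf, handleInner, node) {
  if (!("left" in node)) return handleLeaf(node.value);
  return handleInner(
    node.value,
    innerValueTreeReduce(handleLeaf, handleInner, node.left),
    innerValueTreeReduce(handleLeaf, handleInner, node.right)
  );
}

// Two callbacks; handleEmpty takes no arguments and is a function
// returning a constant, for consistency with the other callbacks.
function linkedListReduce(handleEmpty, handleHead, list) {
  if (list === null) return handleEmpty();
  return handleHead(list.head, linkedListReduce(handleEmpty, handleHead, list.tail));
}

// Three cases, three callbacks.
function maybeEmptyTreeReduce(handleEmpty, handleLeaf, handleInner, node) {
  if ("empty" in node) return handleEmpty();
  if ("value" in node) return handleLeaf(node.value);
  return handleInner(
    maybeEmptyTreeReduce(handleEmpty, handleLeaf, handleInner, node.left),
    maybeEmptyTreeReduce(handleEmpty, handleLeaf, handleInner, node.right)
  );
}
```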
Now we can see even more clearly that the number of handle-callbacks needed for each reduction varies, and that even the number of arguments each handle-callback takes varies. For example, handleEmpty takes no arguments while handleInner for one of the trees may take three.
At first glance it does not seem possible to unify the arguments of these different reduction functions into a single generic function that works for lists, trees and other recursive data structures. Now comes the brilliant idea:
But what if we could merge all the callbacks into a single callback function and leave it to this function to handle each possible case correctly? Let's try:
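A sketch of this for the tree with possibly empty nodes (the marker object for the empty case is an assumption made here):

```javascript
// One callback for everything. Each case is passed to handleCases
// as a flat, node-shaped object; the empty case uses a marker object.
function binaryTreeWithEmptyNodesReduce(handleCases, node) {
  if ("empty" in node) {
    return handleCases({ empty: true });       // marker for the empty case
  }
  if ("value" in node) {
    return handleCases({ value: node.value }); // leaf: just the stored value
  }
  return handleCases({
    // left/right carry the RESULTS of the recursive calls, not subtrees
    left: binaryTreeWithEmptyNodesReduce(handleCases, node.left),
    right: binaryTreeWithEmptyNodesReduce(handleCases, node.right),
  });
}

// The minimum, expressed with a single case-dispatching callback:
const treeMin = tree =>
  binaryTreeWithEmptyNodesReduce(cases => {
    if ("empty" in cases) return Infinity;
    if ("value" in cases) return cases.value;
    return Math.min(cases.left, cases.right);
  }, tree);
```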
Did you notice what we did here? We merged all the callback functions into a single handleCases callback that takes only a single value as argument. But what exactly is this argument that we pass to handleCases? It is either an object containing the results of the recursive function calls, or the value stored in a leaf of the data structure in case we hit a base case, or a special marker object in case we are in an empty case.
But look again: the value we pass to
handleCases exactly reflects the structure of the
node we are currently processing.
For example, when processing an inner node of a binary tree with inner values ({value, left, right}), we call handleCases({ value, left, right }) with value containing the value of the current node, but left and right not containing the subtrees of the original tree; instead they contain the already calculated results of the two recursive calls.
What we pass to handleCases looks structurally exactly like a node of the tree but, instead of actually containing the recursive structure of the tree, it is just a flat container for the results of the recursive function calls.
For each of our data structures we can think of having two different but matching storage structures: one that stores the actual tree and one that stores the result of doing a single recursive calculation step at one tree node. Both look like {value, left, right} | {value} on the surface but differ in their content: the one storing the tree is actually {value: Number, left: Tree, right: Tree} | {value: Number}, and the other, storing a calculation result, is {value: Number, left: Number, right: Number} | {value: Number}.
We have essentially come up with a helper structure for passing results around. By mirroring the structure of the recursive tree nodes we can capture all the results of our recursive function calls and pass them to our handler function as a single value. The handler function can then inspect which case of the structure the recursion is currently working on.
This allows us to define generic recursive functions that work for any recursive data structure, as long as for each data structure we carry the corresponding result container around.
But there is more! Take another look at each of the handleCases(...) function calls and prepare for one more unifying abstraction. For example this one:
handleCases() function call
Wow! All the cases in binaryTreeWithEmptyNodesReduce (and in the other ...Reduce functions, respectively) follow exactly the same structure. They just copy the node, replacing the recursive fields, and apply the handleCases function to the now flat copy of the node.
Could we extract this "make a flat copy of the node and replace the recursive parts" step into a separate, general function? Yes we can! Take a look:
binaryTreeWithEmptyNodesReduce into two
functions: the recursion and the node copying
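A sketch of the split, under the same representation assumptions as the earlier examples:

```javascript
// The node copying: make a flat copy of one node, applying `replacer`
// only to the recursive fields (left and right).
function binaryTreeRecursionReplacer(replacer, node) {
  if ("empty" in node) return { empty: true };
  if ("value" in node) return { value: node.value };
  return { left: replacer(node.left), right: replacer(node.right) };
}

// The pure recursion: recurse via the replacer,
// then hand the flattened node to handleCases.
function binaryTreeWithEmptyNodesReduce(handleCases, node) {
  return handleCases(
    binaryTreeRecursionReplacer(
      subtree => binaryTreeWithEmptyNodesReduce(handleCases, subtree),
      node
    )
  );
}
```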
Now look at that! The binaryTreeRecursionReplacer looks almost like our map function from the beginning, but instead of applying the provided function (replacer) to the value stored in a node, the function is applied to the recursive fields (left and right). So we can think of it as an alternative, completely different way of mapping over a tree node.
Also look at the binaryTreeReduce function now: it is really short! The handling of all the cases has been moved into the binaryTreeRecursionReplacer function. What is left in binaryTreeReduce is the pure recursion.
This is our codified recursion scheme!
This specific recursion scheme is called a catamorphism, because the Greek prefix cata- means downwards (as in collapsing), and what all our functions had in common is that they collapsed the trees or lists they were given into a single value. Yes, Array.reduce is also a catamorphism.
Let's put everything back together:
binaryTreeWithEmptyNodesReduce into two
functions: the recursion and the node copying
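Everything together, as a sketch under the representation assumptions used throughout (leaf = {value}, inner node = {left, right}, empty node = {empty: true}):

```javascript
// Mapping over the recursive positions of a single node:
function binaryTreeRecursionReplacer(replacer, node) {
  if ("empty" in node) return { empty: true };
  if ("value" in node) return { value: node.value };
  return { left: replacer(node.left), right: replacer(node.right) };
}

// The recursion scheme (a catamorphism): the only place recursion happens.
function binaryTreeWithEmptyNodesReduce(handleCases, node) {
  return handleCases(
    binaryTreeRecursionReplacer(
      subtree => binaryTreeWithEmptyNodesReduce(handleCases, subtree),
      node
    )
  );
}

// The earlier calculations, now just case-handling callbacks:
const treeMin = t =>
  binaryTreeWithEmptyNodesReduce(c =>
    "empty" in c ? Infinity :
    "value" in c ? c.value :
    Math.min(c.left, c.right), t);

const countLeaves = t =>
  binaryTreeWithEmptyNodesReduce(c =>
    "empty" in c ? 0 :
    "value" in c ? 1 :
    c.left + c.right, t);

const tree = {
  left: { value: 5 },
  right: { left: { empty: true }, right: { value: 2 } },
};
```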
To be continued
We are not even done yet. The next step is to unify the data
structure that stores the tree ({value, left, right}|{value}) and the structure the stores the recursion result into a
single general data structure.
References
TBD