Recursion


There is a odd thing about this tree structure. It refers to itself. Each of its children are themselves trees again, and the children of those children are trees yet again. Just like our languages and re-write rules, data in this structure contains repeated substructures that resemble their parents.

recursion

Recursion • Dangerous in a fire.

This pattern of repeated substructures could go on and on. Clearly if we want a function which can work on all possible trees we can’t look just a couple of nodes down, we have to define it to work on trees of any depth.

Luckily we can do this, by exploiting the nature of how these substructures repeat and using a technique called recursion.

Put simply a recursive function is one that calls itself as some part of its calculation.

It sounds weird for a function to be defined in terms of itself. But consider that functions can give different outputs when supplied with different inputs. If we give changed, or different inputs to a recursive call to the same function, and provide a way for this function to not call itself again under certain conditions, we can be more confident this recursive function is doing something useful.

As an example we can write a recursive function which will count the number of nodes in our tree structure.

To begin we work out how it will act in the most simple case - if the input tree has no children. In this case we know the result is simply one. Now we can go on to define the more complex case - if the tree has one or more children. In this case the result will be one (for the node itself), plus the number of nodes in all of those children.

But how do we find the number of nodes in all of the children? Well we can use the function we are in the process of defining! Yeah, Recursion.

In C we might write it something like this.

  1. int number_of_nodes(mpc_ast_t* t) {
  2. if (t->children_num == 0) { return 1; }
  3. if (t->children_num >= 1) {
  4. int total = 1;
  5. for (int i = 0; i < t->children_num; i++) {
  6. total = total + number_of_nodes(t->children[i]);
  7. }
  8. return total;
  9. }
  10. return 0;
  11. }

Recursive functions are weird because they require an odd leap of faith. First we have to assume we have a function which does something correctly already, and then we have to go about using this function, to write the initial function we assumed we had!

Like most things, recursive functions almost always end up following a similar pattern. First a base case is defined. This is the case that ends the recursion, such as t->children_num == 0 in our previous example. After this the recursive case is defined, such as t->children_num >= 1 in our previous example, which breaks down a computation into smaller parts, and calls itself recursively to compute those parts, before combining them together.

Recursive functions can take some thought, so pause now and ensure you understand them before continuing onto other chapters because we’ll be making good use of them in the rest of the book. If you are still uncertain, you can attempt some of the bonus marks for this chapter.