Generating function for Catalan numbers using the "Analytic combinatorics" approach.

155 Views Asked by At

So, I first learnt about how to go from the recurence relation of the Catalan numbers to their generating function from exercise 12-4 of the book, Introduction to Algorithms, third edition by Cormen et.al. Here, they define the recurrence relation:

$$b_n=\sum\limits_{k=0}^{n-1}b_k b_{n-k-1}\tag{1}$$

And after some non-trivial amount of algebra, we conclude that the generating function $B(z)=\sum\limits_{n=0}^\infty b_nz^n$ (edit here: changed to $z^n$) satisfies:

$$B(z)=1+zB(z)^2\tag{2}$$ finally yielding the closed-form:

$$B(z)= \frac{1-\sqrt{1-4z}}{2z}\tag{3}$$

Cormen et.al. do this in the context of counting the number of binary trees with $n$ nodes. In figure 12.1, it seems apparent to me that they define a Binary tree as a tree where each node has 0, 1 or 2 children and one node has the special status of the "root".

Now, I was going through the course by Robert Sedgewick on Analytic Combinatorics (Coursera; it's a free course). It follows the textbook of the same name (0th edition). Here, they reach the same generating function in equation (2) in just one step. On page 6, they do this for trees "with $n$ binary branching nodes hence $n+1$ external nodes". They then write the symbolic equation:

$$C=e+(C,n,C)\tag{4}$$

where $e$ denotes the external nodes and $n$ the internal nodes. Defining the "size" of $e$ to be $1$ they get directly the the generating function in (2):

$$C(z)=1+zC(z)^2$$

Then, on page 62, they define the concept of "unary-binary" trees that can have $0$, $1$ or $2$ child nodes and come up with a different generating function for them.

Now for my questions:

  1. I thought the tree-structures Cormen et.al. were considering were "unary-binary" trees as seemed evident from figure 12.1. What is the difference between these trees (which Sedgewick describes as trees with $n$ internal nodes) and unary binary trees?
  2. Why do we need this concept of external nodes to construct equation (4)? What if I was doing this from scratch and never thought of external nodes? Is there a corresponding symbolic equation that can lead to the generating function if I never thought of external nodes (like Cormen et.al. didn't consider them)?
2

There are 2 best solutions below

1
On

The two variants of tree are equivalent in that if you have a Sedgewick tree and remove all external nodes you get a Cormen tree. Conversely, if you have a non-empty Cormen tree, then each node that does not have two children has the missing children added to it, while the empty Cormen tree acquires a root node.

In equation $(4)$ the $e$ represents the tree with only one node which is an external node. The $n$ represents the root node which is an internal node in case there is more than one node.

The interpretation for Cormen trees of the same equation would be that $e$ represents the empty tree with $0$ nodes and $n$ is again the root node.

In both kinds of binary trees, the children nodes are distinguished as being left or right. Thus, in the Cormen tree with a root node and one child, there is a left child version and a right child version. In the case of "unary-binary" trees, the unary child is only one kind.

8
On

For your second question, the $1$ in $(2)$ accounts for the value of $C_0$; this is necessary, since the rest of $(2)$ has a $0$ constant term. Similarly, when we count full binary trees with $n$ internal nodes, we have to account for the unique one with no internal nodes, since for this class the size is the number of internal nodes. Thus, we necessarily get $\mathcal{B}=\{\epsilon\}+\mathcal{B}\times\mathcal{Z}\times\mathcal{B}$ and $B(z)=1+zB(z)^2$.

In both cases we have to account for the size $0$ case separately: in the algebraic derivation from the recurrence it accounts for the initial condition, and in the symbolic approach it accounts it essentially does the same thing, since the tree of size $0$ is the only one that isn’t built up via the product $\mathcal{B}\times\mathcal{Z}\times\mathcal{B}$ that describes hanging two binary trees from a ‘root’ node.