If the maximum depth of a rooted ordered unlabelled binary tree is exactly $d$ what are the expected number of elements in the tree?

67 Views Asked by At

My guess is roughly around $2^{d-1}$, but I am not sure how to show it.

Simply put, let $S$ be the set of all rooted ordered unlabelled binary trees, with maximum depth $d$. What is the expected number of nodes/elements in a tree which is chosen uniformly at random from this set $S$.

1

There are 1 best solutions below

3
On

It's easier to first of all deal with trees of maximum depth up to $d$ and then subtract to get answers for trees of maximum depth exactly $d$.

Let's set up a recurrence relation for the number $a_d$ of rooted ordered unlabeled binary trees of maximum depth up to $d$. It simplifies things to include the empty tree. Then a tree of depth up to $d$ is either the empty tree, or it's a root with each child a tree of depth up to $d-1$. Thus we have the recurrence

$$ a_d = a_{d-1}^2+1\;, $$

with intial value $a_0=1$. This is OEIS A003095. Surprisingly little seems to be known about it. It leads us to OEIS A001699, which is $a_d-a_{d-1}$, and thus the number of rooted ordered unlabeled binary trees of maximum depth exactly $d$. The OEIS entry says that this asymptotically grows as $c^{2^d}$ with an apparently only numerically known constant $c\approx1.5028368$.

Now to get the expected number of elements, we need the sum $s_d$ of the numbers of elements over all trees – again, first the sum for all trees of maximum depth up to $d$ and then, by subtraction, the sum for all tress of maximum depth exactly $d$.

The empty tree contributes no elements to the sum. For the $a_{d-1}^2$ trees with a root, the root contributes $a_{d-1}^2$ to the sum. The sum of the elements of all possible left children of the root is $s_{d-1}$, and they each appear once for each of the $a_{d-1}$ possible right children; and the same vice versa. Thus, the two children contribute a total of $2a_{d-1}s_{d-1}$ to the sum, so overall we get the recurrence

$$ s_d=2a_{d-1}s_{d-1}+a_{d-1}^2\;. $$

Dividing through by $a_{d-1}^2$ and using the fact that for large $d$ we have $a_d=a_{d-1}^2+1\approx a_{d-1}^2$, we arrive at

$$ \frac{s_d}{a_d}\approx2\frac{s_{d-1}}{a_{d-1}}+1\;. $$

The fractions are the mean numbers of elements for trees of maximum depth up to $d$ and $d-1$, respectively. Since $s_d\gg s_{s-1}$ and $a_d\gg a_{d-1}$ for large $d$, this equation also holds approximately when we replace all values for trees of maximum depth up to $d$ by the corresponding values for tree of maximum depth exactly $d$:

$$ \frac{s_d-s_{d-1}}{a_d-a_{d-1}}\approx2\frac{s_{d-1}-s_{d-2}}{a_{d-1}-a_{d-2}}+1\;, $$

and thus

$$ b_d\approx2b_{d-1}+1 $$

for the desired mean number $b_d$ of elements in a tree of maximum depth exactly $d$.

This recurrence has the general solution

$$ b_d\approx\lambda\cdot2^d-1\;. $$

As with the constant $c$ above, since the equation only holds asymptotically for large $d$, we can't determine $\lambda$ exactly from the initial conditions; but we can determine it rather precisely by calculating the first few values of the sequences.

Here's Java code that does this. Here are the results up to $d=10$. (I'm not including $a_n$ and $s_n$ themselves, as these already have almost $200$ digits at $d=10$.)

\begin{array}{c|c|c|c|c|c|c} d&\frac{s_d}{a_d}&\frac{s_d-s_{d-1}}{a_d-a_{d-1}}&2^{-d}\left(\frac{s_d-s_{d-1}}{a_d-a_{d-1}}+1\right)\\\hline 1&0.5&1.0&1.0\\ 2&1.6&2.3333333333333333&0.8333333333333333\\ 3&4.038461538461538&4.619047619047619&0.7023809523809523\\ 4&9.063515509601181&9.2642089093702&0.6415130568356375\\ 5&19.126989287194817&19.141876050195236&0.6294336265686011\\ 6&39.25397857420277&39.254022488049124&0.6289691013757676\\ 7&79.50795714840554&79.50795714859716&0.6289684152234153\\ 8&160.01591429681108&160.01591429681108&0.6289684152219183\\ 9&321.0318285936221&321.0318285936221&0.6289684152219182\\ 10&643.0636571872442&643.0636571872442&0.6289684152219182\\ \end{array}

The calculation for $\lambda$ in the rightmost column converges very nicely. So the expected number of elements of a rooted ordered unlabeled binary tree of maximum depth exactly $d$ is approximately

$$ \lambda\cdot2^d-1 $$

with $\lambda\approx0.628968415221918$.

Thus your guess was already pretty good; you only underestimated the result by a factor of about $\frac54$.