child nodes of a incomplete binary tree

520 Views Asked by At

I have a binary tree. Like this:

       a
      /  \
     b    c
    / \  / \
   d  e  f  g

So according to this as we have a totalNodes of 7:

a has 6 children
b has 2 children
c has 2 children
d has 0 children
e has 0 children
f has 0 children
g has 0 children

I could get this formula to get the number of children of each node:

$$ level = floor(\log_{2} node) \\ children = \frac{totalNodes + 1}{2^{level}} -2 $$

Saying that a is node 1, b is node 2, ... g is node 7.

This works if totalNodes is $2^{level} - 1$. I mean if the last level of the tree is complete. I would like to find a formula that will cover the cases where this last level is incomplete!

For example, if totalNodes is 6, is because we have something like:

       a
      /  \
     b    c
    / \  / 
   d  e  f  

a has 4 children
b has 2 children
c has 1 children
d has 0 children
e has 0 children
f has 0 children
g has 0 children

Thanks for your help!

1

There are 1 best solutions below

1
On BEST ANSWER

Using the fact that a binary tree of depth $n$ contains a complete binary tree of depth $n-1$, we can split the calculation into the sum of children with level $< n-1$ (since the levels are $0$-indexed) and the number of nodes on the final level of the tree that are within the range.

If we order the nodes as follows:

         1
     /       \
    2         3
  /   \     /   \
 4     5   6     7
/ \   / \ / \   / \

and so on, we can see that the range of nodes on the last level (leftmost has a index of 0 on the last level, rightmost has an index of $2^{n-1}$) of a complete binary tree of $depth$ n that are covered by a node with index $x$ is: $$ (x-2^{\lfloor log_2(x) \rfloor})\cdot2^{n-\lfloor log_2(x) \rfloor - 1} \space to \space (x-2^{\lfloor log_2(x) \rfloor}+1)\cdot2^{n-\lfloor log_2(x) \rfloor - 1} $$

Deriving this will be mentioned later. Now we just need to know the number of nodes that actually exist in this range on the last level (empty areas can be replaced with "phantom nodes"). If the total number of nodes is $k$ with $k \le 2^n-1$, this can be calculated with:

$$ min((k - 2^{n-1} + 1),(x-2^{\lfloor log_2(x) \rfloor})\cdot2^{n-\lfloor log_2(x) \rfloor - 1})\\ - min((k - 2^{n-1} + 1),(x-2^{\lfloor log_2(x) \rfloor})\cdot2^{n-\lfloor log_2(x) \rfloor - 1}) $$

This can be combined with your original formula. Let $l$ be the level of node $x$, $c$ be the number of children of node $x$, $k$ be the number nodes in the tree and $n$ is the depth of the tree (i.e. $k \le 2^n-1$). We obtain the following:

$$ l = \lfloor log_2(x) \rfloor \\ c = \frac{2^{n-1}}{2^l} - 2 + min((k - 2^{n-1} + 1),(x-2^{l}+1)\cdot2^{n-l-1})\\ - min((k - 2^{n-1} + 1),(x-2^{l})\cdot2^{n-l-1}) $$

For example, in the tree shown in your question, the number of children $c$ for node a ($x = 1$) is:

$$ \begin{align*} c & = \frac{2^{3-1}}{2^0} - 2 + min((7 - 2^{3-1}),(1-2^{0}+1)\cdot2^{3-0-1}) - min((7 - 2^{3-1} + 1),(1-2^{0})\cdot2^{3-0-1})\\ & = \frac{4}{1} - 2 + min(4,4) - min(4,0)\\ & = 4 - 2 + 4 - 0\\ & = 6 \end{align*} $$

Proof

To show that the range is correct, we first consider the size of the range. We know that in a complete binary tree of depth $n$, as the level $l$ of a node $x$ increases, the number of nodes on the last layer of the tree with the lowest common ancestor at $x$ decreases. We also know that for such a node $x$, its children are $2x$ and $2x+1$ (unless it is a leaf node). Each of these will also have 2 children till the leaf nodes are reached at depth $n-1$. Thus the number of nodes on the last layer of the tree with node $x$ as a common ancestor is $2^{n-l-1}$.

Notice that by the nature of a binary tree (or trees in general), no node will have 2 parents. Thus for any 2 nodes on the same level of the tree, the ranges of the last layer with the respective nodes as their lowest common ancestor are disjoint. Since all nodes must be covered by a range, the ranges are consecutive.

From here we can derive the left border of the range. Since the ranges are disjoint and consecutive (i.e. no node in between 2 ranges), for a level $l$, there are exactly $2^l$ ranges. We can derive which range a node $x$ at level $l$ covers with $x-2^l$. This is because the leftmost node of a level is always a power of $2$, thus this value will be $0$ for the leftmost node. The rightmost node is exactly $2^l-1$ larger than the leftmost node, thus the index of this node will never exceed $2^{l+1}$. Thus all the $x-2^l$ for each node $x$ with level $l$ is always in the range $[0,2^l-1]$. We know the index of the range on the last level, where we can derive the leftmost index as $(x-2^{l})\cdot2^{n-l-1}$ and the rightmost index as $(x-2^{l}+1)\cdot2^{n-l-1}-1$.

To find the number of nodes in the range on the last layer, we need the number of nodes from the leftmost index $0$ to the rightmost index $(x-2^{l}+1)\cdot2^{n-l-1}-1$. This the smaller of the maximum number of nodes that can be there which is: $$ (x-2^{l}+1)\cdot2^{n-l-1}-1-0+1 = (x-2^{l}+1)\cdot2^{n-l-1} $$

or the number of nodes on the bottom layer itself, which is $k-2^{n-1}+1$, subtracted by the nodes from $0$ to the leftmost index $(x-2^{l})\cdot2^{n-l-1}$ excluding the leftmost index, which is the smaller of:

$$ (x-2^{l})\cdot2^{n-l-1} - 0 + 1 - 1 = (x-2^{l})\cdot2^{n-l-1} $$

or the number of nodes itself, which is $k-2^{n-1}+1$. Thus we derive the formula for the number of nodes on the last layer that are children of $x$ as:

$$ min((k - 2^{n-1} + 1),(x-2^{l}+1)\cdot2^{n-l-1})\\ - min((k - 2^{n-1} + 1),(x-2^{l})\cdot2^{n-l-1}) $$