binary search tree with key values $ a_1 < \dots < a_k$. How do I choose $j$ to still get a tree with minimal height?

76 Views Asked by At

So at the beginning I have an empty binary search tree. Moreover I have key values $a_1 < \dots < a_k$. How can we choose $j$ so that after the first insert of an element with key value $a_j$ a tree of minimal height is still possible by further insertions.

My thoughts:

My first idea was to choose $j=1$ ( later $j=k$), which is obviously false. I know that we somehow have to use the fact that we have $a_1 < \dots < a_k$. Maybe we could choose $j = \frac{k}{2}$ or something like that. But this is just an idea. I don't know how to argue for that.

1

There are 1 best solutions below

0
On BEST ANSWER

The largest number of nodes a binary tree (not necessarily BST) of height $H \in \mathbb{N}$ (a single vertex tree has height $1$) can accommodate is $$ 1 + 2 + ...+2^{H-1} = 2^H - 1, $$ which happens precisely when all levels are full. Thus, for a BST with $n$ nodes we need at least height $H \in \mathbb{N}$ such satisfying $n \leq 2^H - 1$, which implies $$H \geq \lceil \log_2(1+n) \rceil \tag{1}$$ where $\lceil \cdot \rceil$ stands for the ceiling function. Since BST is a binary tree, the best minimal height $H$ one can get on $n$ keys must satisfy $(1)$.

Let us now see that $(1)$ is attainable given a proper structure of the nodes (keys). First, notice that the particular values of $a_1<...<a_n$ of the keys do not matter, we may WLOG work with keys $1<2<...<n$ to simplify the notation. Next, observe that for $n\in \mathbb{N}$ in the range $$2^{k-1} \leq n < 2^k \tag{2} $$ the height estimate $(1)$ gives $\lceil \log_2(1+n) \rceil = k $. Hence it is enough to show that for $n$ keys where $n$ satisfies $(2)$ we can build a BST of height $k$ on this keys. This will be the optimal configuration, i.e. one with the minimal height (there can be more than one trees on $n$ keys having minimal height depending on $n$). To show this consider the worst case when $n = 2^k - 1$, we show by induction that the optimal height must be $k$. Indeed, the case $k = 1$ is trivial. For $k>1$ choose $2^{k-1}$ as the root of the BST and put everything from $\{1,2,...,2^{k-1} - 1 \}$ into the left subtree and the rest $\{2^{k-1} + 1,...,2^k - 1\}$ to the right subtree. This will preserve the BST condition for the root node. Notice that both left and right subtrees are binary search trees with $2^{k-1} - 1$ nodes, and hence by induction both have height $k-1$ in the optimal configuration. This gives height $k$ for the original set of keys and completes the induction step.

Now, for the set of incomplete keys, i.e. when $2^{k-1} \leq n < 2^k $ but not necessarily $n = 2^k - 1$, we may add "dummy" keys to make $n = 2^k-1$, then construct the optimal tree for this enlarged set of keys and then drop the nodes with dummy keys.

For example, the optimal placement for keys $\{1,2,...,7\}$ would be $$ 4\to [2\to[1,3], 6\to [5,7]], $$ with $4$ as the root.