I recently started studying algorithms on my own using nptel lectures in youtube. Kindly clarify how the professor wrote below equation for expected number of probes in unsuccessful search in open addressing.
My attempt for this:-
Kindly clarify.
I recently started studying algorithms on my own using nptel lectures in youtube. Kindly clarify how the professor wrote below equation for expected number of probes in unsuccessful search in open addressing.
My attempt for this:-
Kindly clarify.
Copyright © 2021 JogjaFile Inc.


Your mistake is assuming that every slot is occupied with probability $\alpha$, when in fact that is only true for the first slot: after that, the probability decreases, because the previous slots are known to be occupied.
For the first slot, the probability is $\frac nm$. Given that the first slot is occupied, the probability that the second slot is also occupied is $\frac{n-1}{m-1}$: of the remaining $m-1$ slots, $n-1$ are occupied. Given that the first two slots are occupied, the probability that the third slot is also occupied is $\frac{n-2}{m-2}$, and so on.
We can find the expected number of probes as a sum over indicator variables. Let $X_k=1$ if a $k^{\text{th}}$ probe is necessary (the first $k-1$ slots are occupied) and $X_k=0$ otherwise.. Then we have $$ \mathbb E[X] = \mathbb E[X_1] + \mathbb E[X_2] + \dots + \mathbb E[X_{n+1}]. $$ For the first probe, $\mathbb E[X_1] = 1$. By the probability argument above, $\mathbb E[X_k] = \frac mn \cdot \frac{m-1}{n-1} \cdots \frac{m-k+1}{n-k+1}$. We get $$ \mathbb E[X] = 1 + \frac mn + \frac mn \cdot \frac{m-1}{n-1} + \frac mn \cdot \frac{m-1}{n-1} \cdot \frac{m-2}{n-2} + \dots $$ which is factored in the equation you saw as $$ \mathbb E[X] = 1 + \frac mn \left(1 + \frac{m-1}{n-1}\left(1 + \frac{m-2}{n-2}\left(1 + \dots\right)\right)\right). $$
That's not the best way to find the expected value, however. Another way to write the formula for $\mathbb E[X_k]$ is $\binom{m-k+1}{n-k+1}/\binom mn$ which has its own combinatorial interpretation: if we need a $k^{\text{th}}$ probe, then the first $k-1$ slots we check are occupied, and there are $\binom{m-k+1}{n-k+1}$ ways to occupy the other $m-k+1$ slots in the hash table.
Therefore $$ \mathbb E[X] = \sum_{k=1}^{n+1}\frac{\binom{m-k+1}{n-k+1}}{\binom mn} = \frac1{\binom mn} \sum_{j=0}^n \binom{m-j}{n-j}. $$ The sum simplifies to $\binom{m+1}{n}$ by the hockey-stick identity and so we just get $\binom{m+1}{n}/\binom mn = \frac{m+1}{m-n+1}$ as the expected value.
The lecture you're watching says that this is "almost $\frac1{1-\alpha}$" where $\alpha = \frac nm$ is the load factor. To see this, divide top and bottom by $m$; we get $\frac{1 + 1/m}{1 - n/m + 1/m} = \frac{1 + 1/m}{1 - \alpha + 1/m}$, and if $m$ is large, this is essentially $\frac1{1-\alpha}$.