Probability of collision in a hash function.


Let $T = \{1, \dots, m-1\}$ be a hash table with open addressing, and for $i = 1, \dots, n$ with $n \leq \frac{m}{2}$ let $X_i$ be the random variable counting the number of probes needed to insert the $i$-th entry into the table.

Show that $P(X_i > k) \leq \frac{1}{2^k}$ for all $i = 1, \dots, n$ and all $k \geq 0$, under the uniform hashing assumption.

My work so far:

By Markov's inequality, $P(X_i > k) = P(X_i \geq k + 1) \leq \frac{E(X_i)}{k+1}$, so it would suffice to show that $\frac{E(X_i)}{k+1} \leq \frac{1}{2^k}$.

With $n$ entries already in $T$, $k$ possible probes, and the uniform hashing assumption giving $n$ possible collisions, I set the expected value to be

$E(X_i) = \binom{n}{k} \cdot \frac{n}{m} = \frac{n!}{k!(n-k)!} \cdot \frac{n}{m}$

Since $n \leq \frac{m}{2}$ we have the following estimations:

$E(X_i) = \frac{n!}{k!(n-k)!} \cdot \frac{n}{m}$

$\leq \frac{n^n}{k! \cdot n^{n-k}}$

$\leq \frac{(\frac{m}{2})^n}{k! \cdot (\frac{m}{2})^{n-k}} \cdot \frac{1}{2}$

$= \frac{m^k}{k! \cdot 2^{k+1}}$

From here on I'm a bit stuck: I still have an $m$ in the bound, and the $2$ is raised to the power $k + 1$ rather than $k$. I feel like I'm missing something very basic here.
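(Editorial aside, not part of the original question: Markov's inequality alone cannot produce an exponentially decaying tail here. Even using the standard textbook bound for open addressing with uniform hashing and load factor $\alpha = n/m \leq \frac{1}{2}$, namely $E(X_i) \leq \frac{1}{1-\alpha} \leq 2$, Markov only yields

```latex
P(X_i > k) \;\leq\; \frac{E(X_i)}{k+1} \;\leq\; \frac{2}{k+1},
```

which decays like $1/k$, far slower than $1/2^k$. A direct argument on the probe sequence is needed, as the answer below suggests.)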


It might be a bit simpler to argue directly. Assume we have $m$ open bins (it might make more sense for $T$ to have indices $0, 1, \dots, m-1$), and at time $i \in [1, n]$ you throw a ball into one of the $m$ bins uniformly at random. When the $i$-th ball (or entry) arrives, there are $i - 1 \leq n$ occupied bins, so the probability of a collision on any single probe is $(i - 1)/m \leq n/m \leq 1/2$, where the last inequality follows by assumption. For the event $\{X_i > k\}$ to occur, the first $k$ probes must all collide, and each collision occurs with probability at most $1/2$. (This should remind you of a geometric random variable.) You should now be able to conclude from here.
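To sanity-check the bound numerically, here is a small Monte Carlo sketch of my own (the function name `simulate_tail` and all parameter values are hypothetical, and it uses the with-replacement probing model from the argument above, where each probe is an independent uniform slot):

```python
import random

def simulate_tail(m, n, k, trials=20_000, seed=1):
    """Estimate P(X_n > k): the probability that the n-th insertion into
    a size-m table needs more than k probes, modeling each probe as an
    independent uniformly random slot."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        # n-1 entries are already in the table; by symmetry their
        # positions can be taken as a uniform random (n-1)-subset.
        occupied = set(rng.sample(range(m), n - 1))
        probes = 0
        while True:
            probes += 1
            if rng.randrange(m) not in occupied:
                break  # found an empty slot
        if probes > k:
            exceed += 1
    return exceed / trials

# Example: m = 20, n = 10 (so n <= m/2); the claim says P(X_10 > 3) <= 1/8.
estimate = simulate_tail(20, 10, 3)
assert estimate <= 1 / 2**3
```

With these parameters the per-probe collision probability is $9/20$, so the estimate should land near $(9/20)^3 \approx 0.09$, comfortably below $2^{-3} = 0.125$.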