After hashing of how many keys will the probability that any new key hashed collides with an existing one exceed 0.5?

2.4k Views Asked by At

Consider a hash function that distributes keys uniformly. The hash table size is 20.

Here I am getting answer 11 considering the fact that probability will of collision will be 50% more only when we already have 10 entries in the hash table and we are about to map one more entry in the hash table .

Is this approach correct ?

2

There are 2 best solutions below

3
On BEST ANSWER

The probability of collision in each entry will be $\dfrac{1}{20}$

After inserting $k$ values the probability becomes $\dfrac12$

Then we have $\dfrac{1}{20}\times k=\dfrac12$

Therefore, $k=10$

0
On

Let us first analyze the question a bit:

"After hashing of how many keys will the probability that any new key hashed collides with an existing one exceed $0.5$ ? "

I feel the question can be interpreted in two ways.


FIRST

After hashing of how many keys will the probability that any new key hashed NEXT collides with an existing one exceed $0.5$?

or

Find such $k$ for which after hashing $k$ keys the probability that just the next $(k+1)^{th}$ key (any new key hashed) collides any of the $k$ keys inserted till now exceed $0.5$ (without considering the chances of any collision which might have occurred while inserting the first $k$ keys)

The approach for this is as follows.

Since there are $m=20$ slots in the hash table and there are $k$ slots already occupied in the table. So a collision shall occur for the $(k+1)^{th}$ key if it hashes into any of these $k$ slots.

The probability that this occurs is $\frac{k}{m}$. Now this value should be greater than $0.5$. So we have,

$$ \frac{k}{m} > 0.5 $$

$$\implies k> 10 \quad\quad\quad\text{as m=20}$$

Now as $k$ is an integer so , $k$ is at least $11$.


SECOND

After hashing of how many keys will the probability that any new key hashed TILL NOW collides with an existing one exceed $0.5$?

or

Find $k$ such that after hashing $k$ keys, the probability that all keys hashed till now (a key which is inserted for the first time is a new key and it can be any of the $k$ keys inserted till now) collide, which is the same as (1- probability that no two of the first $k$ keys hash to the same slot)$^{\dagger}$ exceeds $0.5$.

From $\dagger$ we clearly get the essence of the birthday problem.

Now,

probability that no two of the first $k$ keys hash to the same slot ($q$) = $\frac{m}{m}.\frac{m-1}{m}...\frac{m-k+1}{m}$

Now required probability $(p) = 1-q$.

Now,

$$p>0.5$$ $$\implies 1-\frac{m}{m}.\frac{m-1}{m}...\frac{m-k+1}{m} >0.5$$ $$\implies 1.\frac{19}{20}...\frac{20-k+1}{20} < 0.5 \quad\text { as m =20}$$

Let $\lambda(k) =1.\frac{19}{20}...\frac{20-k+1}{20}$

Now $\lambda(5) = 0.5814 \nless 0.5 $

$\lambda(6) = 0.43605 \lt 0.5 $

So $k = 6$