I have file with $10^4$ keys. Every key is from $[0, 10^6).$ How many keys have duplicates on average?
I think this is the same as ask to put $10^4$ balls into $10^6$ urns and then ask how many urns contain more than one ball.
EDIT:
For now i started with the following process: I count expected number of urns with one ball $k$-th balls were used. Say, we have $U$ urns and $B$ balls.
\begin{align*} E_1 &= 1\\ E_2 &= E_1 + \frac{U-1}{U}\\ E_3 &= E_2 + \frac{U - E_2}{U} = E_2 + \frac{(U-1)^2}{U^2}\\ E_4 &= E_3 + \frac{U - E_3}{U} = E_3 + \frac{(U-1)^3}{U^3}\\ \dots\\ E_B &= \sum_{k=0}^{B-1}\frac{(U-1)^k}{U^k} \end{align*}
Ok, now i want to find expected number of empty urn. Probability that an urn stays empty is $\left(\frac{U-1}{U}\right)^B$. So, expected number of empty urns is $E(Z) = U\left(\frac{U-1}{U}\right)^B$.
Thus, number of urns with more than one ball is $U - E_B - E(Z)$.
Thanks to antkam i see that i calculate $E_i$ in a wrong way.