Expected value of first duplication by sampling elements with replacement

24 Views Asked by At

Let $G$ be a group of order $m$. I am sampling elements $a_1, a_2, \ldots$ from $G$ with replacement - random and evenly distributed. The question is, when will there be the first duplicate, i.e. the minimum $j$ for which there exists a $i < j$ with $a_i = a_j$.

I thought of it in the following way: The probability of having a duplicate in the $k$-th run is

$$\frac{m}{m} \frac{m-1}{m} \cdots \frac{m-k+2}{m} \frac{k-1}{m} = \frac{m! \cdot (k-1)}{(m-k+1)! \cdot m^k}$$

that means "$k-1$ times no duplicate and a duplicate in the $k$-th run, so the expected value of samples before a match appears is

$$\sum_{k=1}^{m+1} (k-1) \frac{m! \cdot (k-1)}{(m-k+1)! \cdot m^k}.$$

Here, in theorem 2.2 it is stated that the expected number of evaluations before a match appears is $\sqrt{\pi m/2}$.

Checking the result with $m = 50$ gives me $\approx 8.54$ using my formula (see here) and $5\sqrt{\pi} \approx 8.86$ using the formula from the article. Why is there a difference resp. where is the mistake I (?) made?