Expected amount of repeats in a random sequence of integers

3.7k Views Asked by At

I'm looking at a series of random integers generated by a CSPRNG and noticed that there are more repeats (that is a number is in the sequence 2 or more times e.g. 9,3,8,5,6,3 - 3 is a repeat) than I expected.

I generated 10,000 numbers, each between 1 and 100,000, this resulted in 9,516 unique numbers. Does this seem correct, and if so, how would I calculated the expected about of unique numbers for n random numbers of a range 1 to x?

3

There are 3 best solutions below

1
On BEST ANSWER

When sample $n$ times from the set $\{1,\dots, x\}$, then the expected number of unique values is $x[1-(1-1/x)^n]$. With $n=10000$ and $x=100000$, this gives approximately $9516.303$.

0
On

Your question is strongly related to the birthday paradox. Calculating the expected value of unique numbers could be done, e.g., by using combinatorial arguments. But as to your question: Yes, such a number indeed seems to be plausible.

1
On

The probability a given single number does not appear at least once is $\left(1-\frac{1}{100000}\right)^{10000} \approx e^{-1/10}$

so the expected number not appearing is $100000$ times this, near $90483.7$,

making the expected number of unique numbers about $9516.3$, surprising close to what you observed.