I have currently finished my semester and I´m trying to learn some stuff on my own. Recently I stumbled upon a problem:
What is the probability that some x distinct numbers are all present in an array of size n, if the array can contain only numbers from this range x, with n much greater than x?
Example
x = 3 (1, 2, 3)
n = 10
array= {1, 2, 2, 1, 3, 2, 1, 3, 2, 3}
Maybe it´s just an easy problem in probability, but just out of curiosity, can someone enlighten me if there is a formula for this problem.
Thanks in advance!
This looks like a good place to apply the Principle of Inclusion/Exclusion.
Let's say the sequence of $n$ numbers has "property $i$" if $i$ does not appear in the sequence, for $1 \le i \le x$. Define $S_j$ to be the total of all the probabilities of the sequences that have $j$ of the properties (i.e., the sequence is missing $j$ numbers), for $1 \le j \le x-1$. Since there are $\binom{x}{j}$ ways to select the set of $j$ missing numbers, and the probability of a sequence of $n$ numbers not containing any of those numbers is $((x-j)/x)^n$, $$S_j = \binom{x}{j} \left( \frac{x-j}{x} \right)^n$$
By Inclusion/Exclusion, the probability of a sequence with none of the properties, i.e. a sequence with no missing numbers, is $$p_0 = 1 + \sum_{j=1}^{x-1} (-1)^j S_j$$