Probability of visiting all states at least once in N trials

116 Views Asked by At

Assuming my world is made up of 100 distinct states such that:

$$S = (s1, s2, s3, ... s100)$$

Let's call the number of states 'K'.

And assuming that I perform N trials, wherein each trial I visit one of the states in S randomly, with replacement.

What is the probability that I visit all states at least once in N successive independent trials?

Here's what I have so far: The probability of visiting all states at least once is the inverse of the probability of never visiting a particular state, for each state $s_i$.

$$P(S_{once}) = N * (1 - P(s_i))^{N}$$

However, the math shows I am wrong here. Assuming N = 200, In this example, this works out to: $$P(S_{once}) = 200 * (1 - \frac{1}{100})^{200}$$

Clearly, this seems wrong. For one thing, the probability is greater than 1 at the end of this exercise. My guess is I am assuming some independence here which does not hold true. Can you help me come up with the right formula here, assuming there is one?

Update

I made an error in translating my words to formula:

The probability of visiting all states at least once is the inverse of the probability of never visiting a particular state, for each state $s_i$

This should be represented by the formula: $$P(S_{once}) = K * (1 - P(s_i))^{N}$$

Answer

Thanks to the comments pointing me to the Coupon Collector problem, I have my answer now.

The number of trials T required to visit all K states is:

$$ T = K * \sum_{i=1}^K\frac{1}{i} $$

This is approximately equal to $K ln K$

Reference:https://www.youtube.com/watch?v=3mu47FWEuqA

Note: The video has a typo on the final formula, the summation is from 1 to K (not K-1).