Is there a name for this distribution? (generating until all numbers appear)

169 Views Asked by At

A computer program generates a number between $1$ and $n$ every second ( there is a $1/n$ probability for each number ). It continues until all $n$ possible numbers were generated.

Let $X$ be the number of seconds it took for the program to generate all different numbers. Is there a name for the distribution of $X$ ?. I couldn't find it anywhere.

Also, I found that $ \displaystyle{\,\mathrm{P}\left(X = k\right) = \sum_{i = 0}^{n}\left(-1\right)^{i + 1}{n - 1 \choose i - 1} \left(1 - \frac{i}{n}\right)^{k - 1}} $. Could someone verify that ?.

Thank you!

1

There are 1 best solutions below

3
On BEST ANSWER

As Matthew Conroy mentions in a comment, this is related to the Coupon Collector's Problem. I am not sure if this distribution has a name, but here is a derivation of the probability given in the question.


Probability of Completion on the $\boldsymbol{m^\text{th}}$ Trial

Let $S_j$ be the set of arrangements where $j$ has not been chosen after $m$ trials. The sum of the probabilities of all intersections of $k$ of the $S_j$'s is $$ \overbrace{\ \ \ \binom{n}{k}\ \ \ }^{\substack{\text{number of ways}\\\text{to choose $k$}\\\text{particular numbers}}}\overbrace{\left(1-\frac kn\right)^m}^{\substack{\text{probability of}\\\text{$k$ particular}\\\text{numbers not}\\\text{being chosen}\\\text{after $m$ trials}}} $$ Inclusion-Exclusion says that the probability of not getting some number after $m$ trials is $$ \sum_{k=1}^n(-1)^{k-1}\binom{n}{k}\left(1-\frac kn\right)^m $$ Thus, the probability of getting the last number on the $m^\text{th}$ trial is $$ \begin{align} &\sum_{k=1}^n(-1)^{k-1}\binom{n}{k}\left[\left(1-\frac kn\right)^{m-1}-\left(1-\frac kn\right)^m\right]\\ &=\sum_{k=1}^n(-1)^{k-1}\binom{n}{k}\frac kn\left(1-\frac kn\right)^{m-1}\\ &=\bbox[5px,border:2px solid #C0A000]{\sum_{k=1}^n(-1)^{k-1}\binom{n-1}{k-1}\left(1-\frac kn\right)^{m-1}} \end{align} $$


Expected Duration (Using the Formula Above)

We can compute the expected duration using the formula above $$ \begin{align} &\sum_{m=1}^\infty\sum_{k=1}^n(-1)^{k-1}\binom{n-1}{k-1}m\left(1-\frac kn\right)^{m-1}\\ &=\sum_{k=1}^n(-1)^{k-1}\binom{n-1}{k-1}\frac1{\left(1-\left(1-\frac kn\right)\right)^2}\\ &=n\sum_{k=1}^n(-1)^{k-1}\binom{n}{k}\frac1k\\ &=n\sum_{k=1}^n(-1)^{k-1}\sum_{j=k}^n\binom{j-1}{k-1}\frac1k\\ &=n\sum_{j=1}^n\sum_{k=1}^j(-1)^{k-1}\binom{j}{k}\frac1j\\ &=n\sum_{j=1}^n\frac1j\\[6pt] &=\bbox[5px,border:2px solid #C0A000]{nH_n} \end{align} $$


Expected Duration (Summing Expected Durations)

If a stream of independent events occurs, each with probability of success $p$, the expected number of events until a success is $\frac1p$. The probability of picking a new number after we have picked $k$ distinct numbers is $\frac{n-k}{n}$. Therefore, the duration after we've picked $k$ distinct numbers until we pick the $k+1^\text{st}$ number is $\frac{n}{n-k}$. Thus, the expected duration until we pick all numbers is $$ \begin{align} \sum_{k=0}^{n-1}\frac{n}{n-k} &=\sum_{k=1}^n\frac nk\\ &=\bbox[5px,border:2px solid #C0A000]{nH_n} \end{align} $$