Coupon Collector seeing exactly n - 1 unique coupons

101 Views Asked by At

I'm wondering whether the following claim is true: for each $n \in \mathbb{N}$, there exists an $m \in \mathbb{N}$ such that with constant probability, $m$ many draws from a uniform distribution over $n$ items results in seeing exactly $n-1$ unique elements.

Stated in terms of the coupon collector problem, I'm asking whether we can collect $m$ many coupons in order to see exactly $n - 1$ unique coupons with probability $\Omega(1)$.

This feels like it should be true, I'm just not sure how to pick $m$. (Something like $n \cdot (\log(n) - 1)$ feels roughly right.) I think that, using inclusion-exclusion, the probability of seeing exactly $n-1$ unique coupons after $m$ draws should be: $$ \dfrac{\sum_{i=1}^{n-1} (-1)^i \cdot \binom{n}{i-1} \cdot (n - i - 1)^m}{n^m}. $$ Now it's just a matter of picking the right $m$ and proving an absolute lower bound to the above quantity. Or maybe there are some nicer tricks?

Thanks for any help.

1

There are 1 best solutions below

1
On BEST ANSWER

Here's a sketch of an idea. A number of details would have to be cleaned up, but I think the gist of it is valid.

For $n$ types of coupons, the probability of $m$ draws yielding exactly $j$ unique coupons may be expressed using Stirling numbers of the second kind:$$ p(n,m,j) = \frac{1}{n^m}{m \brace j}\frac{n!}{(n-j)!}. $$ For $m \approx n \log n$, ${m \brace j}$ is maximal in the neighborhood of $j = n$. So let's assume that the values of ${m \brace n-i}$ are roughly equal for $i = 0$, $1$, $2$, etc. In that case,$$ p(n,m,n-i) \propto \frac{1}{i!} $$ should hold approximately. Assuming this is the case, the sum over $i$ is $e$, so we expect $p(n,m,n-i) \approx 1/(ei!)$ and, in particular,$$ p(n,m,n-1) \approx \frac{1}{e}. $$ If this is true, it answers the question.

Here is some numerical evidence for the key assumption that the Stirling numbers are roughly equal for large $n$. We define $r(n,i) \doteq {m \brace n-i}/{m \brace n}$ with $m = \lceil n \log n \rceil$ and find the following values for $i = 0$ to $10$:$$ \begin{align} n = 30&: 1, 1.059, 0.945, 0.703, 0.432, 0.216, 0.087, 0.028, 0.007, 0.001, 0.0002\\ n = 300&: 1, 1.010, 0.997, 0.962, 0.907, 0.835, 0.751, 0.660, 0.567, 0.475, 0.388\\ n = 3000&: 1, 1.001, 1.000, 0.995, 0.987, 0.977, 0.964, 0.948, 0.929, 0.908, 0.885 \end{align} $$