There are 100 balls in a box: 86 are blank, 3 balls each are labeled 'a', 'b', 'c' and 'd', and 1 ball each is labeled 'e' and 'f'. Every time you draw a ball, it is returned to the box. What is the average (and, if it's possible to calculate, the median) number of draws needed to get each of the a, b, c, d, e and f balls at least once?
I looked at the coupon collector's problem, but it's slightly different from this one, so I couldn't figure out how to calculate the answer.
The average number of draws required to pick one particular least likely ball (e or f) is 100, so I thought the answer might be 100.
But I wrote a quick Python script (https://repl.it/Eb5d/0), and the empirical result tells me that the answer is about 160.
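For reference, a simulation of this kind can be sketched as follows. This is a minimal illustration and not the code behind the link above; the names `BOX`, `TARGETS`, `draws_until_complete` and `estimate_mean`, as well as the trial count and seed, are assumptions made for the example.

```python
import random

# Box contents from the question: 86 blanks (None), 3 each of a-d, 1 each of e and f.
BOX = [None] * 86 + list("abcd") * 3 + list("ef")
TARGETS = set("abcdef")


def draws_until_complete(rng):
    """Draw with replacement until every target label has been seen once."""
    seen = set()
    draws = 0
    while seen != TARGETS:
        ball = rng.choice(BOX)
        draws += 1
        if ball is not None:
            seen.add(ball)
    return draws


def estimate_mean(trials=20000, seed=0):
    """Monte Carlo estimate of the average number of draws."""
    rng = random.Random(seed)
    return sum(draws_until_complete(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(estimate_mean())
```

With enough trials the estimate settles near 160 rather than 100, because the draw that completes the collection has to wait for the slower of e and f, not just one of them.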
How do I solve this mathematically?
Non-standard coupon collector's problem?
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)
You can calculate this expectation using a slightly modified version of the approach in arxiv/1209.4592.
For the sake of simplicity, I will assume that there are only $3$ different types of balls, which are labeled $1$, $2$ and $3$ and are drawn with positive probabilities $p_1$, $p_2$ and $p_3$, respectively. The generalization to more labels is straightforward, but cumbersome to write down.
Now in this modified problem we want to calculate the average number of draws until we have drawn balls with labels $1$ and $2$ (but not necessarily label $3$, which corresponds to the blank balls in your example).
Let $Y_i$ be the number of draws until we have found a ball with label $i$ for the first time. Note that $Y_i$ follows a geometric distribution with parameter $p_i$ and $Y = \max(Y_1, Y_2)$ is the number of draws until we have found a $1$ as well as a $2$.
Since $Y_1$ and $Y_2$ are not independent, we can't directly calculate the distribution of $Y$. But we can calculate the distribution of $\min(Y_1, Y_2)$. Since $$P(\min(Y_1, Y_2) > k) = P(Y_1 > k, Y_2 > k) = P(\text{the first $k$ balls were labeled neither $1$ nor $2$}) = (1 - (p_1 + p_2))^k,$$
we know that $\min(Y_1, Y_2)$ is geometrically distributed with parameter $p_1 + p_2$. Now we can apply the maximum-minimums identity, which states in this case $Y = Y_1 + Y_2 - \min(Y_1, Y_2)$. From this we can calculate the expectation of $Y$: $$E[Y] = E[Y_1] + E[Y_2] - E[\min(Y_1, Y_2)] = \frac{1}{p_1} + \frac{1}{p_2} - \frac{1}{p_1 + p_2}$$
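You can now generalize this to your case, where you have $7$ types of balls: the six labels a to f, which must all be drawn, and the blanks, which play the role of label $3$ above. With more than two required labels, the maximum-minimums identity turns into an alternating sum over all non-empty subsets of the required labels, and the minimum over each subset is again geometrically distributed with parameter equal to the sum of that subset's probabilities.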
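As a sketch of that generalization, the maximum-minimums identity gives $E[Y] = \sum_{\emptyset \neq S} (-1)^{|S|+1} / p_S$, where $S$ runs over the non-empty subsets of the required labels and $p_S$ is the sum of their probabilities. The same inclusion-exclusion argument gives $P(Y \le n) = 1 + \sum_{\emptyset \neq S} (-1)^{|S|} (1 - p_S)^n$, from which the median can be read off. The Python below evaluates both for the probabilities in the question; the function names (`subset_sums`, `expected_draws`, `cdf`, `median_draws`) are illustrative choices, and the median is taken here to be the smallest $n$ with $P(Y \le n) \ge 1/2$.

```python
from fractions import Fraction
from itertools import combinations

# Draw probabilities of the six labels that must all be seen (from the question).
# The 86 blank balls only enter through the fact that these do not sum to 1.
probs = {
    "a": Fraction(3, 100), "b": Fraction(3, 100),
    "c": Fraction(3, 100), "d": Fraction(3, 100),
    "e": Fraction(1, 100), "f": Fraction(1, 100),
}


def subset_sums(p):
    """Yield (subset size, total probability) for every non-empty subset."""
    values = list(p.values())
    for size in range(1, len(values) + 1):
        for combo in combinations(values, size):
            yield size, sum(combo)


def expected_draws(p):
    """E[Y] via the maximum-minimums identity (exact, as a Fraction)."""
    return sum((-1) ** (size + 1) / total for size, total in subset_sums(p))


def cdf(p, n):
    """P(Y <= n): every required label has appeared within n draws."""
    return 1 + sum((-1) ** size * (1 - total) ** n
                   for size, total in subset_sums(p))


def median_draws(p):
    """Smallest n with P(Y <= n) >= 1/2."""
    n = 0
    while cdf(p, n) < Fraction(1, 2):
        n += 1
    return n


if __name__ == "__main__":
    print(float(expected_draws(probs)))
    print(median_draws(probs))
```

The expectation evaluates to roughly 160.5 draws, which agrees with the simulated value of about 160 reported in the question. Exact fractions are used so that the alternating sum does not suffer from floating-point cancellation; with only six labels the 63 subsets are cheap to enumerate.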