Coupon collector problem with unequal draw probabilities

146 Views Asked by At

This is an instance of the "coupon collector's problem" in which not all the "coupons" have the same probability of appearance. Here the "coupons" are boxes of cereal, each having a prize in it. I have been struggling with it for some time.

Suppose there are 7 items (e.g., Snow White's 7 dwarfs). One per box of cereal. Except for Dopey, they are distributed evenly over the boxes of cereal. But the number of Dopeys is 1/2 the number of each of the others. The question is: What is the expected value of the number boxes of cereal that should be bought to get a complete set of dwarfs? As an example, if they were all evenly distributed, I get that you would need to buy 1+7/6+7/5+7/4+7/3+7/2+7/1 = 18.15 boxes.

1

There are 1 best solutions below

3
On

Here is a solution based on exponential probability generating functions. Readers interested in learning about generating functions can find many resources here: How can I learn about generating functions?

Number the "coupon" types from $1$ to $7$, with Dopey as type number $1$, and associated probabilities $p_1, p_2, p_3, \dots , p_7$. From the problem statement we find $p_1 = 1/13$ and $p_i = 2/13$ for $2 \le i \le 7$. Let's say $T$ is the number of the draw on which we first have a complete set of coupons.

If $T \le n$ then we have a compete set of coupons on draw $n$ or earlier. The exponential generating function of $P(T \le n)$ is $$f(x) = \prod_{i=1}^7 (e^{p_i x} - 1) = (e^{p_1 x} -1) (e^{p_2 x} -1)^6$$ since $p_i = p_2$ for $i \ge 2$. Since $P(T > n) = 1 - P(t \le n)$, the EGF of $P(T > n)$ is $e^x - f(x)$. We are interested in $P(T > n)$ because by a well-known theorem, $$E(T) = \sum_{n \ge 0} P(T > n)$$ Since we know the EGF of $P(T > n)$, we can relate this infinite sum to an integral. Because $$\sum_{n=0}^{\infty} P(T>n) \frac{1}{n!} x^n = e^x - f(x)$$ and $$\int_0^{\infty} x^n e^{-x} \;dx = n!$$ we have $$\begin{align} E(T) = \sum_{n=0}^{\infty} P(T>n) &= \int_0^{\infty} e^{-x}(e^x - f(x)) \;dx \\ &= \int_0^{\infty} e^{-x}(e^x - (e^{p_1 x} -1) (e^{p_2 x} -1)^6) \;dx \\ & \approx \boxed{20.3579} \end{align}$$ on substituting $p_1 = 1/13$ and $p_2 = 2/13$. I admit to having used a computer algebra system to evaluate the integral, but a paper and pencil computation should not be too difficult.