Finding the CDF of the Coupon Collectors Problem with non uniform probabilities

115 Views Asked by At

Given a set of $n$ coupons with draw probabilities $p_1, p_2, ..., p_n$ such that $$\sum_{i=1}^{n} p_i = 1$$ What is the probability $P(X \le x)$ of having completed a full set of coupons in fewer than $x$ coupons.

1

There are 1 best solutions below

10
On

If $I\subseteq \{1,2,\dots n\},$ then define $$p(I)=\sum_{i\in I}p_i.$$

Then inclusion-exclusion gives you:

$$P(X\leq x)=\sum_{I\subseteq \{1,\dots,n\}} (-1)^{|I|}(1-p(I))^x$$

This is not an easy formula - it requires $2^n$ terms. But you probably cannot do better.

The formula is much simpler for $p_i=\frac1n$ then $p(I)=\frac{|I|}n.$ Then you can group terms by size and get:

$$\sum_{k=0}^n(-1)^k\binom nk \left(\frac{n-k}n\right)^x$$

If coupon $i$ is much rarer than the others, you can probably estimate it, for $x$ large, with:

$$1-(1-p_i)^x$$

If there are $k$ coupons with probability $p$ and $n-k$ coupons with probability $q,$ with $kp+(n-k)q=1,$ then:

$$P(X\leq x)=\sum_{i=0}^k\sum_{j=0}^{n-k}(-1)^{i+j}\binom{k}{i}\binom{n-k}j(1-ip-jq)^x$$