Simplified Coupon Collector's Problem

149 Views Asked by At

I am trying to understand the coupon collector's problem, so I am hoping to start with a very basic example, with real numbers, and understand it. There are several examples on this site, but never any practical applications. The theory and formulas are great, but it can also be helpful to use real numbers to show how to properly use them.

Let's say there are 2 types of coupons, with a non-uniform distribution. To make it easy, let's say that statistically, there are 99 normal coupons for every 1 rare coupon.

From what I understand, for a normal distribution, the expected number is $nH_n$, where $H_n$ is the nth Harmonic number. In this case, that is $3/2$, so the expected number of tickets needed to get both would be $2\cdot \frac 32 = 3$.

For this special case, I believe it would be governed by the equation in the attached image (from Wikipedia: https://en.wikipedia.org/wiki/Coupon_collector%27s_problem)

Formula for nonuniform coupon collector estimated value

I am not sure how to apply this to my simple example. When I apply it, I get either positive or negative 98.8989... It seems obvious that the answer is the positive, but I must be missing something with the negative, or the order in which the probabilities need to be used in the equation? Also, not even sure this is the right answer, as I feel like it should be higher.

Edit: Adding how I arrived at my answer, as I am apparently using this formula incorrectly.

$E(T) = \sum_{q=0}^{m-1} (-1)^{m-1-q} \sum_{|J|=q}1/(1-P_J)$.
Since $m=2$, we have $E(T) = \sum_{q=0}^{2-1} (-1)^{2-1-q} \sum_{|J|=q}1/(1-P_J)$. $E(T) = \sum_{q=0}^{1} (-1)^{1-q} \sum_{|J|=q}1/(1-P_J)$.

I am admittedly not remembering how to handle two the two sums, but I guess I assumed multiplication:

For $q=0$, $E(T) = (-1)^1 \times (\frac{1}{1-0.99}) = -100$

For $q=1$, $E(T) = (-1)^0 \times (\frac{1}{1-0.01}) = 1.0101010101$

Adding those together I get $\pm 98.9898,$ depending on the order I put them.

This must be wrong, since the answer can't be negative, but not sure what I am doing wrong.

3

There are 3 best solutions below

1
On BEST ANSWER

You seem to have misunderstood the second sum. It runs over all subsets of coupons that contain $q$ coupons; $|J|=q$ means that the cardinality (the number of elements) of the subset $J$ is $q$.

So there’s one subset with $0$ coupons, the empty set, and that contributes $(-1)^{1-0}\cdot\frac1{1-0}=-1$, and there are two subsets with $1$ coupon each, and these contribute

$$(-1)^{1-1}\left(\frac1{1-\frac1{100}}+\frac1{1-\frac{99}{100}}\right)=\frac{100}{100-1}+\frac{100}{100-99}=\frac{100}{99}+100=\frac{10000}{99}\;,$$

for a total of

$$\frac{10000}{99}-1=\frac{9901}{99}\;. $$ About the downvoting and closing of your question (neither of which I was involved in): If you want to avoid this in the future, don’t make your questions rely on images for essential content – see Why image cannot be used for explaining my maths problem?.

0
On

Let's say there are $2$ types of coupons, with a non-uniform distribution. To make it easy, let's say that statistically, there are $99$ normal coupons for every 1 rare coupon.

Personally, when I encounter a non-standard application of Math analysis, I simply go back to basics, and try to re-invent the wheel. That is, I ignore canned formulas, and attack the problem from scratch.

MathSE reviewers more experienced in this area than I am will probably be able to avoid my (inelegant) approach. Anyway...

Either:

  • Case 1 - the first coupon is the rare one: probability = $(1/100)$

  • Case 2 - or it isn't: probability = $(99/100)$.


In general, if you are going after a single coupon, from day 1, with probability$~ = p,~$ and $~q = (1-p)~$, then the expected time required is

$$(1)p + [2qp] + [3q^2p] + \cdots = p\left[\frac{1}{(1-q)^2}\right] = \frac{1}{p}.$$

In Case 1, with the probability of the missing normal coupon at $(99/100)$, it will take $(100/99)$ more attempts.

In Case 2, with the probability of the missing rare coupon at $(1/100)$, it will take $(100)$ more attempts.

So, the total expected time will be

$$\left\{ ~\left[\frac{1}{100}\right] \times \left[1 + \frac{100}{99}\right] ~\right\} + \left\{ ~\left[\frac{99}{100}\right] \times \left[1 + 100\right] ~\right\}$$

$$= \frac{199}{9900} + \frac{9999}{100} = 100.\overline{01}.$$


Addendum

I manually calculated

$$1 + 2q + 3q^2 + \cdots = \frac{1}{(1-q)^2} ~: ~0 < q < 1$$

via

$(1 + q + q^2 + \cdots)$

$+ ~q(1 + q + q^2 + \cdots)$

$+ ~q^2(1 + q + q^2 + \cdots)$

$+ ~\cdots$

$$= (1 + q + q^2 + \cdots) \times (1 + q + q^2 + \cdots).$$

An alternative approach is

$$E = p(1) + q(1 + E) \implies (1-q)E = p + q = 1 \implies E = \frac{1}{p}.\tag1 $$

The LHS of (1) above is explained as:

the first attempt will succeed with probability $p$, and fail with probability $q$. When it fails, you should expect the time required to then be $1 + E$, because after the first failed attempt, you are back where you started.

0
On

After further research, I was able to determine my mistake.

Forgive my poor wording, but the second sum notation is the sum of the probabilities of choosing each type of coupon in the set $J$ of size $q$. What this means is that for $q=0$, the probability of drawing any coupon is $0$, so Pj is just $0$. For $q=1$, I need to add $1/(1-Pj)$ for each type of coupon. In my example, it would be $1/(1-0.99)$ and $1/(1-0.01), the probabilities of getting each type of coupon.

Here is the correct solution for this example. $E(T) = \sum_{q=0}^{m-1} (-1)^{m-1-q} \sum_{|J|=q}1/(1-P_J)$.

Since $m=2$, we have

$E(T) = \sum_{q=0}^{2-1} (-1)^{2-1-q} \sum_{|J|=q}1/(1-P_J)$

$E(T) = \sum_{q=0}^{1} (-1)^{1-q} \sum_{|J|=q}1/(1-P_J)$

For $q=0$, $E = (-1)^{1-0} \times (\frac{1}{1-0}) = -1 \times 1 = -1$

For $q=1$, $E = (-1)^{1-1} \times ((\frac{1}{1-0.99}) + (\frac{1}{1-0.01})) = (1) \times (100 + 1.0101) = 101.\overline{01}$

Adding these together we get the correct answer

$E(T)=-1+101.\overline{01}=100.\overline{01}$

Edit: This was posted before I saw the correct answer provided.