Coupon Collector Problem: How to derive a recurrence for expected time.

118 Views Asked by At

Let $T_{i,j}$ denote the expected time taken to get $j$ distinct coupons given that one already has $i$ distinct coupons. Say that we have $n$ distinct coupons, then I want to find a recurrence involving $T_{i,j}$ and $T_{i+1,j}.$

I thought of the following: If one has $i$ distinct coupons then one could first get $1$ distinct coupon in expected time $T_{i,i+1}$ and then get the remaining $j-1$ distinct coupons in expected time $T_{i+1,j-1}.$ Thus giving us the following recurrence:

$$T_{i,j}=T_{i,i+1}+T_{i+1,j-1}.$$

If this is correct, then I want to write a recursive program to get values for $T_{i,j}$ and so I thought of the following base cases:

$$T_{i,i} = 0,T_{i,0}=0,T_{0,j} = jH_{j}$$

where $H_j$ is the $j$'th Harmonic number. I don't know if this is correct, so any inputs will be much appreciated.

1

There are 1 best solutions below

0
On

I want to clarify the notation: I will write that $T_{i,j}$ is the expected time taken to hold $i + j$ distinct coupons in total given that one already holds $i$ distinct coupons. You use both this definition and the definition "$T_{i,j}$ is the time to get $j$ distinct coupons in total given that one has $i$ distinct coupons" in your recurrence formula. Using the first definition the recurrence formula generated from conditioning on getting a single new distinct coupon would be $$T_{i,j} = T_{i,1} + T_{i+1,j-1},$$ and with the second definition it would be $$T_{i,j} = T_{i,i+1} + T_{i+1,j}.$$ As I write this, yours is a mixture of both. I'm assuming that you meant the first definition since you mentioned the term $T_{i+1,j-1}$ specifically in your question.

Let $S_{i,j}$ be the (random) time taken to get $i + j$ distinct coupons given that one already has $i$ distinct coupons. Then $$S_{i,j} = S_{i, 1} + S_{i+1, j-1}$$ because we must reach $i + 1$ distinct coupons before we reach $j$ coupons (unless $j = 1$ in which case $S_{i +1 , j - 1} = 0$).

The recurrence $T_{i,j} = T_{i, 1} + T_{i + 1,j-1}$ is correct because $\mathbb{E}(S_{i,j}) = T_{i,j}$, and so $$T_{i,j} = \mathbb{E}(S_{i,j}) = \mathbb{E}(S_{i, 1}) + \mathbb{E}(S_{i + 1,j-1}) = T_{i,j} + T_{i+1,j-1}.$$

I agree that $T_{i,0} = 0$ and that $T_{0,j} = jH_j$. We also have $T_{i,1} = n/(n - i)$, since the distribution of coupons drawn is geometric with parameter $(n - i)/n$.

A solution to the recurrence in terms of the Harmonic numbers is given by $$T_{i,j} = T_{i, 1} + T_{i + 1,j-1} \qquad \qquad \qquad \qquad \qquad \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace$$ $$ = T_{i, 1} + T_{i + 1, 1} + T_{i + 2,j - 2} \qquad \qquad \qquad \thinspace \thinspace \thinspace \thinspace \thinspace$$ $$ = T_{i, 1} + T_{i + 1, 1} + \cdots + T_{i + j - 1, 1} \qquad \qquad \thinspace \thinspace \thinspace \thinspace \thinspace$$ $$ = \frac{n}{n - i} + \frac{n}{n - i - 1} + \cdots + \frac{n}{n - i - j + 1}$$ $$ = n(H_{n - i} - H_{n - i - j}). \qquad \qquad \qquad \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace \thinspace$$