Difficulty in understanding the probability exactly one type i in the final collection out of n types

1.3k Views Asked by At

It is Question 63 on page 174 in Ross's book (Introduction to Probability Models-11th edition)

Suppose that there are n types of coupons, and that the type of each new coupon obtained is independent of past selections and is equally likely to be any of n types. Suppose one continues collecting until a complete set of at least one of each type is obtained.

(a) Find the probability that there is exactly one type i coupon in the final collection.

Hint: Condition on T, the number of types that are collected before the first type ii appears.

The same question can be found here, but I do not have enough credit to comment on. A conditional probability problem on coupon collection

The Solution Says:

Let Si be the event there is only one type i in the final set.

P{Si = 1}

= $\sum_{j=0}^{n-1}$ P{Si=1|T=j}P{T=j}

= $1/n$ $\sum_{j=0}^{n-1}$ P{Si=1|T=j}

= $1/n$ $\sum_{j=0}^{n-1}$ $1/(n-j)$

The final equality follows because given that there are still n−j−1
uncollected types when the first type i is obtained, the probability starting
at that point that it will be the last of the set of n−j types consisting of
type i along with the n−j−1 yet uncollected types to be obtained is, by 
symmetry, 1/(n−j).

Here are my questions:

  1. Why $\sum_{j=0}^{n-1}$ P{Si =1|T=j}P{T=j} = $\sum_{j=0}^{n-1}$P{Si=1|T=j}?

  2. How to understand P{Si=1|T=j} = $1/(n-j)$ ? Especially when $j = 0$, it is $1/n$? Does that mean the whole branch starting from coupon i is $1/n$? If it is the case, I think case like the instance i-i-... is invalid.

Your help will be greatly appreciated.

1

There are 1 best solutions below

2
On BEST ANSWER

For part 1 of the question, $$ \sum_{j=0}^{n-1} P\{S_i =1\mid T=j\}P\{T=j\} \neq \sum_{j=0}^{n-1} P\{S_i=1\mid T=j\},$$ and the solution never claimed that those two quantities are equal. What the solution says is that $P\{T=j\} = \frac1n,$ and therefore \begin{align} \sum_{j=0}^{n-1} P\{S_i =1\mid T=j\}P\{T=j\} &= \sum_{j=0}^{n-1} P\{S_i =1\mid T=j\}\times\frac1n \\ &= \frac1n\sum_{j=0}^{n-1} P\{S_i =1\mid T=j\}. \end{align} You omitted the factor $\frac1n$.

For part 2, let's consider the case $j=0.$ In that case we want to evaluate $P\{S_i = 1 \mid T=0\}.$ The event $T=0$ means that the collector has received no coupons of any other type at the time they receive their first coupon of type $i.$ The only way for this to happen is if the very first coupon received is a coupon of type $i.$

So now we have established that we are looking for the probability that $S_i = 1,$ given that the very first coupon is of type $i.$ The "given" means we can consider "first coupon is of type $i$" as an event that has already happened. We will then have $S_i = 1$ if and only if the following chain of events happens:

  • Starting from time $t_i$ (the moment after receiving the first coupon), the collector continues to collect coupons;
  • After time $t_i,$ the collector collects at least one of each of the other $n - 1$ types of coupons;
  • Between time $t_i$ and the time when the collector collects the last of the other $n - 1$ types of coupons, the collector does not receive any coupon of type $i.$

Let's call this chain of events $A_1.$ Then $P\{S_i = 1 \mid T=0\} = P\{A_1\mid T=0\}.$

Now let's compare that chain of events to the following chain of events that occur to a different collector who starts collecting coupons at time $t_i$ (that is, at time $t_i$ the second collector has no coupons at all):

  • Starting from time $t_i,$ the collector collects coupons;
  • After time $t_i,$ the collector collects at least one of each of the $n - 1$ types of coupons that are not of type $i$;
  • Between time $t_i$ and the time when the collector collects the last of the other $n - 1$ types of coupons, the collector does not receive any coupon of type $i.$

Let's call this chain of events $A_2.$

If the second collector also is motivated to collect all $n$ types of coupons, they will continue to collect coupons until they have received a coupon of type $i,$ which will certainly happen eventually; and then type $i$ will be the last type of coupon that collector collects. So $P\{A_2\mid T=0\}$ is exactly the probability type $i$ will be the last type of coupon collected by a collector who collects all $n$ types of coupons (starting at time $t_i$), given that $T=0.$

Now, the chance that type $i$ will be the last type you collect is independent of when you start collecting, so we can ignore the part about "starting at time $t_i.$" The given fact that $T = 0$ likewise has no effect on the chances of the second collector to collect coupons; it is merely something that happened to the first collector. So we're down to just, "What is the chance that type $i$ is the last of the $n$ types collected?" And that probability is $\frac1n$ by symmetry; that is, $P\{A_2\mid T=0\} = \frac1n.$

But now compare the chain of events $A_1$ to the chain of events $A_2.$ In both events we have a collector collecting all types of coupons except type $i$ between time $t_i$ and some later time. There is nothing to distinguish between the probabilities of those two events; $P\{A_1\mid T=0\} = P\{A_2\mid T=0\}.$ Therefore $P\{A_1\mid T=0\} = \frac1n,$ and therefore $P\{S_i = 1 \mid T=0\} = \frac1n.$

For $j > 0$ there is an additional subtlety: after $t_i$ (when the collector gets the first coupon of type $i$), we ignore the existence of the $j$ types of coupons collected before $t_i.$ With regard to the question of the collector collects all the other $n - j - 1$ types of coupon before getting another coupon of type $i$ after time $t_i,$ the receipt of one of those $j$ types coupons has as much impact as receiving a postcard from Aunt Sally. So we treat the problem as one involving only $n - j$ types of coupon, namely type $i$ and the $n - j - 1$ types not yet collected, and we want the probability of collecting all of the other $n - j - 1$ types before collecting (the next) type $i.$ That probability is $\frac1{n-j}$ for the same reasons as in the case $j = 0.$