What is the average number of draws (2 cards per draw with shuffles in between) before I had seen all 52 cards in the deck?

898 Views Asked by At

On average, how many times would I have to draw two cards from a deck (replacing and shuffling between each draw) before I had seen each of 52 cards in the deck?

The process is:

  • Draw the top two cards of a shuffled deck and note their face values.
  • Place those two cards back in the deck and shuffle.
  • Draw the top two cards and note their values

Statistically, how may times would I have to repeat that process until I had noted every card in the deck?

1

There are 1 best solutions below

1
On

The average number of draws required to see every card in the 52-card deck is $$ \langle t_0 \rangle = \frac{155457152493897006250124899500414238955093}{1327631579600482691104433390359453744200} \approx 117.0936... $$


To see this, we have to use the language of Markov chains. For generality, let's assume there are $n$ cards in the deck; we'll focus on the $n = 52$ case later. Let's suppose that we have already seen $i$ cards, and we flip over two more cards at random. The probability that they are both cards we've already seen is $$ p_{i \to i} = \frac{i(i-1)}{n(n-1)}; $$ the probability that we've seen neither card (and therefore that we have seen $i+2$ cards after this draw) is $$ p_{i \to i+2} = \frac{(n-i)(n-i-1)}{n(n-1)}; $$ and the probability that we will have seen one of the cards but not the other is $$ p_{i \to i+1} = 2 \frac{(n-i)i}{n(n-1)}. $$ (The factor of two comes from the fact that the "new" card can be either the first or the second one we flip over.) As one can check, these three probabilities always add up to 1.

Now, the question is: suppose we have seen $i$ cards already. How many more two-card flips will it take for us to see all $n$ cards? Let's denote this number by $t_i$; the term of art for this is the "mean first passage time" to the state where we've seen all $n$ cards. If we do another two-card draw, then we can go to a state where we have seen $i$, $i+1$, or $i+2$ cards, with the above probabilities for each option. Now, here's the trick: it must be the case that the number of draws needed to go from state $i$ to our final state is 1 plus the number of draws needed to get to the final state from each "daughter" of state $i$ (i.e., each state that can be obtained from state $i$), with each number weighted by the probability above. In our case, this can be expressed as $$ t_i = 1 + p_{i \to i} t_i + p_{i \to i+1} t_{i+1} + p_{i \to i+2} t_{i+2} $$

This gives us a recurrence relation between the amounts of time $t_i$ needed to go from seeing $i$ cards to seeing $n$ cards. We can define $t_{52} = 0$; the question is then, what is $t_0$? In principle, you could solve this by hand—it's a system of 52 equations and 52 unknowns, for $i = 0$–$51$—but it's going to be much easier to feed this into a computer algebra program and let it hack through the algebra. I used Mathematica, and the result for $n = 52$ is given above. (The code is below, if anyone's interested.)

draws[nval_] := Module[{n = nval, probmat, tvec}, 
  probmat = Table[Which[j == i, ( i (i - 1))/(n (n - 1)), j == i + 1, (2 i (n - i))/(n (n - 1)), j == i + 2, ((n - i) (n - i - 1))/( n (n - 1)), True, 0], {i, 0, n}, {j, 0, n}];
  tvec = Append[Table[t[i], {i, 0, n - 1}], 0];
  t[0] /. First[Solve[Thread[Drop[tvec, -1] == ConstantArray[1, n] + Drop[probmat.tvec, -1]], Drop[tvec, -1]]]
  ]

To answer a question from the comment section for the original question: this problem is closely related to the "Coupon Collector's Problem", as noted by Brian Tung. However, in this version one is guaranteed that the second card one draws will not be a duplicate of the first one. This would be like a version of the original coupon collector's problem in which one is guaranteed that every even-numbered coupon one receives will not be a duplicate of the odd-numbered one preceding it. There is a small but non-zero chance of this happening in the original Coupon Collector's problem; since the current problem eliminates these "wasted" draws, we can conclude that the number of coupons/cards one is required to look at must be slightly smaller in the present case. For the original coupon collector's problem with $n = 52$, we have $t_0 \approx 236$; so in the present case, we expect $t_0 < 236/2$ (which it is.)