Simulated probability doesn't match closed-form answer

I am trying to calculate the probability that a hand of 13 cards contains at least five pairs. Three of a kind counts as one pair; four of a kind counts as two pairs.

I simulated millions of hands and found empirically that the probability is about 9.3%. I did this in two different programming languages using a few different RNGs for shuffling. The C++ code is linked at the bottom of this post.
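For readers who cannot open the linked C++ code, here is a minimal Python sketch of the kind of simulation described above. It is not the code from the post: the function names `count_pairs` and `estimate`, the seed, and the trial count are all illustrative assumptions.

```python
import random
from collections import Counter


def count_pairs(ranks):
    """Count pairs in a hand given as a list of ranks.

    A rank held k times contributes k // 2 pairs, so three of a kind
    counts as one pair and four of a kind counts as two pairs.
    """
    return sum(k // 2 for k in Counter(ranks).values())


def estimate(trials, hand_size=13, min_pairs=5, seed=None):
    """Monte Carlo estimate of P(hand has at least min_pairs pairs)."""
    rng = random.Random(seed)
    # Suits do not matter for pairs, so the deck is just 4 copies of each rank.
    deck = [rank for rank in range(13) for _ in range(4)]
    hits = 0
    for _ in range(trials):
        hand = rng.sample(deck, hand_size)  # deal without replacement
        if count_pairs(hand) >= min_pairs:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(estimate(200_000, seed=1))
```

With a few hundred thousand trials the estimate should settle near the 9.3% figure reported above, up to sampling noise.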

However, the following reasoning implies that this is wrong:

Choose five values (out of 13). For each of them, choose two cards to make the pair. This gives ten cards forming five distinct pairs. Then choose 3 cards from the remaining 42 to fill out the rest of the hand.

Thus the number of ways to make five pairs is ${13 \choose 5} {4 \choose 2}^5 {42 \choose 3}$. So the probability should be: $$\frac{{13 \choose 5} {4 \choose 2}^5 {42 \choose 3}}{{52 \choose 13}}.$$

Actually, this doesn't take into account the possibility of making quads (it assumes all the values are distinct). But it should at least give a lower bound.

Putting the following expression into Wolfram Alpha or Google shows that the probability is about 18.09%: "((13 choose 5) * (4 choose 2)^5 * (42 choose 3)) / (52 choose 13)"
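The same arithmetic can be reproduced locally. The short Python check below uses only the standard library's `math.comb`; the variable names `naive` and `total` are illustrative.

```python
from math import comb

# Count produced by the reasoning above: 5 values, a pair from each,
# then any 3 of the remaining 42 cards.
naive = comb(13, 5) * comb(4, 2) ** 5 * comb(42, 3)

# Total number of 13-card hands from a 52-card deck.
total = comb(52, 13)

print(f"{naive / total:.4%}")
```

This agrees with the roughly 18.09% quoted above, so the discrepancy is not an arithmetic slip in evaluating the expression.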

What is the reason for this discrepancy? Is my reasoning wrong, or my code?

C++ simulation code: http://pastebin.com/DGh63B5b

Best answer

Your combinatorial formula is triple counting hands that contain three of a kind.

For example, a hand containing the Queens of Clubs, Spades, and Hearts is, by your formula, counted once with the Club and Spades as one of the pairs and the Heart among the remaining $42$ cards, again with the Club and Heart as one of the pairs and the Spade among the remaining $42$ cards, and once more with the Spade and the Heart as one of the pairs and the Club among the remaining $42$ cards.
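The three decompositions in this example can be listed mechanically. The snippet below is only an illustration; the card labels `QC`, `QS`, `QH` are shorthand introduced here.

```python
from itertools import combinations

queens = ["QC", "QS", "QH"]

# Each way of naming two queens "the pair" leaves the third queen to be
# picked up later among the 42 remaining cards.
splits = [
    (pair, [card for card in queens if card not in pair])
    for pair in combinations(queens, 2)
]

for pair, leftover in splits:
    print("pair:", pair, "leftover:", leftover)

print(len(splits), "ways to build the same three queens")
```

Every split yields the same final hand, yet the formula counts each split separately.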

A correct lower bound would come from ${13\choose5}{4\choose2}^5{32\choose3}$, which counts the number of ways to form a hand with $5$ pairs of distinct values, none of which is extended to three of a kind (i.e., the final three cards come from the $32$ cards that don't match any of the $5$ pairs). This gives a probability of about $7.8\%$, which is less than what your simulation produces.
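As a cross-check that avoids overcounting altogether, the hands can be counted exactly, one rank at a time. For each rank, taking $k$ of its four cards can be done in ${4\choose k}$ ways and contributes $\lfloor k/2 \rfloor$ pairs. The Python sketch below is an addition, not part of the original answer, and the function name `hands_with_at_least` is hypothetical.

```python
from math import comb


def hands_with_at_least(min_pairs, hand_size=13, ranks=13, suits=4):
    """Exact number of hands containing at least min_pairs pairs."""
    # ways[(cards, pairs)] = number of ways to reach that state
    # using the ranks processed so far.
    ways = {(0, 0): 1}
    for _ in range(ranks):
        nxt = {}
        for (cards, pairs), n in ways.items():
            for k in range(suits + 1):  # take k cards of this rank
                if cards + k > hand_size:
                    break
                key = (cards + k, pairs + k // 2)
                nxt[key] = nxt.get(key, 0) + n * comb(suits, k)
        ways = nxt
    return sum(
        n for (cards, pairs), n in ways.items()
        if cards == hand_size and pairs >= min_pairs
    )


total = comb(52, 13)
exact = hands_with_at_least(5) / total
bound = comb(13, 5) * comb(4, 2) ** 5 * comb(32, 3) / total

print(f"exact: {exact:.4%}")
print(f"bound: {bound:.4%}")
```

The exact figure should land close to the simulated 9.3%, above the roughly 7.8% bound, which points to the combinatorial reasoning rather than the simulation as the source of the discrepancy.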