Birthday problem: why is this solution wrong?

1k Views Asked by At

This question is about the birthday problem: the probability that in a group of n people, at least two of them have the same birthday (https://en.wikipedia.org/wiki/Birthday_problem).

An easy way to calculate the probability is to calculate first the probability that no two people have the same birthday.

Let's say that I want to calculate the probability that in a group of 20 people, NO two people have the same birthday.

So, for 20 people with different birthdays, I can choose the first birthday in 365 ways, the second in 364 ways and so on, while for 20 people who can have the same birthday, I can choose the first birthday in 365 ways, the second also in 365 ways, and so on. At the end:

$p={365 \cdot 364 \cdot 363 \cdot ... \cdot 346 \over 365 \cdot 365 \cdot 365 \cdot ... \cdot 365}\approx 0.59$

This is the right method and I understand it. I don't understand why the following method is wrong:

The probability that in a group of 20 people NO two people have the same birthday is the ratio between the combination without repetition of 20 birthdays and the combination with repetition of 20 birthdays:

$$p={C_{365,20} \over C'_{365,20}}={\binom{365}{20} \over \left(\binom{365}{20}\right)}={\binom{365}{20} \over \binom{365+20-1}{20}}=\frac{{{365!}\over {20!\cdot(365-20)!}}}{{(365+20-1)!}\over{20!\cdot{(365-1)!}}}\approx 0.35$$

I understand the mistake is in the denominator (as the numerator is the same of the other method, after simplifying that 20!), but why? Isn't it right to calculate the k-combination with repetition of 20 birthdays?

Thanks for helping!

1

There are 1 best solutions below

3
On BEST ANSWER

As you say, the problem is in the denominator

The number of equally probable ways of choosing $k$ distinguishable items without repetition from $n$ is $\frac{n!}{(n-k)!} = k!{n \choose k}$. If the items were not distinguishable, there would be ${n \choose k}$ ways, as you have in your numerator

The number of equally probable ways of choosing $k$ distinguishable items with repetition from $n$ is $n^k$ and this is the correct denominator to use in a counting calculation of probability

Meanwhile your ${n+k-1 \choose k}$ represents the number of ways of choosing $k$ indistinguishable items with repetition from $n$. But these ways are not (in the real world) equally probable and so cannot be used in a simple counting calculation of probability. As a simple example, flipping two coins, an unordered one heads and one tails outcome is twice as probable as a two heads outcome