Why can't I use a direct approach in the birthday problem?

67 Views Asked by At

In a variation of the birthday problem, we are trying to find the probability that another student in the class shares our birthday. We can use the complement rule to find the probability that a student does not share the same birthday as us.

P(S) = 1 - P(*S) = 1 - (P(*S1) * P(*S2))

Can’t we find the probability of having the same birthday as us and add up to the number of students, i.e

P(S) = P(S1) + P(S2)

Is there anything wrong with this approach?

For example, with two students, the Complement method

1 - ((364/365) * (364/365)) = 0.005471945956089

Direct method

1/365 + 1/365 = 2/365 = 0.005479452054795

With 100 students
Complement method = 0.2399329261841
Direct method = 0.273972602739726

The answers are different.

Can someone clarify what I am missing here?

1

There are 1 best solutions below

0
On

The direct approach, although not recommended, is do-able, via Inclusion-Exclusion. This approach provides a remedy to the issue of over-counting raised by the comment of Henry.

See this article for an introduction to Inclusion-Exclusion. Then, see this answer for an explanation of and justification for the Inclusion-Exclusion formula.

Let $~Q~$ denote the set of all possible distributions of the birthdays of the $~n~$ people, without any regard for whether any of the people share a birthday with you.

Let $~S~$ denote the subset of $~Q~$ where at least one person shares a birthday with you.

Then the desired computation of the probability is

$$\frac{|S|}{|Q|} ~: ~|Q| = (365)^n,$$

So, the problem reduces to computing $~|S|,~$ which represents the number of elements in the set $~S.$


Index the $~n~$ people $~P_1, P_2, \cdots, P_n.$

For $~k \in \{1,2,\cdots,n\},~$ let $~S_k~$ denote the subset of $~S~$ such that person $~P_k~$ shares a birthday with you. For example, $~S_1~$ represents the set of all possible birthday distributions, where person $~P_1~$ shares a birthday with you, and any of the other people may or may not share a birthday with you.

Then, since $~( ~S_1 \cup S_2 \cup \cdots \cup S_n ~) = S,~$ the problem reduces to computing

$$| ~S_1 \cup S_2 \cup \cdots \cup S_n ~|.$$


Let $~T_1~$ denote $ ~|S_1| + |S_2| + \cdots + |S_n|.$

For $~r \in \{2,3,\cdots,n\},~$ let $~T_r~$ denote

$\displaystyle \sum_{1 \leq i_1 < i_2 < \cdots < i_r \leq n} | ~S_{i_1} \cap S_{i_2} \cap \cdots \cap S_{i_r} ~|.$

That is, $~T_r~$ represents the sum of $~\displaystyle \binom{n}{r}~$ terms.

Then, in accordance with Inclusion Exclusion theory,

$$| ~S_1 \cup S_2 \cup \cdots \cup S_n ~| = \sum_{r=1}^n (-1)^{r+1} T_r.$$

So, the problem reduces to computing $~T_1, T_2, \cdots, T_n.~$


$\underline{\text{Computation of} ~T_1}$

$S_1~$ represents the set where $~P_1~$ shares your birthday, and all of the other people are unrestricted.

Therefore, $~|S_1| = (365)^{n-1}.~$

By Symmetry, for each $~k \in \{2,3,\cdots,n\},~$ you have that $~\displaystyle |S_k| = |S_1|.~$

Therefore,

$$T_1 = \binom{n}{1} (365)^{n-1}.$$


$\underline{\text{Computation of} ~T_2}$

$( ~S_1 \cap S_2 ~)~$ represents the set where $~P_1~$ and $~P_2~$ both share your birthday, and all of the other people are unrestricted.

Therefore, $~|S_1 \cap S_2| = (365)^{n-2}.~$

By Symmetry, for each
$~1 \leq i_1 < i_2 \leq n ~: ~i_1, i_2 \in \Bbb{Z},$
you have that $~\displaystyle |S_{i_1} \cap S_{i_2}| = |S_1 \cap S_2|.~$

Therefore,

$$T_2 = \binom{n}{2} (365)^{n-2}.$$


$\underline{\text{Computation of} ~T_r ~: 3 \leq r \leq n}$

The analysis in this section will parallel the analysis of the previous section.

There will be $~\displaystyle \binom{n}{r}~$ terms,

and each term will equal $~(365)^{n-r}.$

Therefore,

$$T_r = \binom{n}{r} (365)^{n-r}.$$


$\underline{\text{Final Computation}}$

$$\forall r \in \{1,2,\cdots, n\}, ~T_r = \binom{n}{r} (365)^{n-r}.$$

$$ |S| = \sum_{r=1}^n (-1)^{r+1} ~T_r.$$

The probability is

$$\frac{|S|}{|Q|} ~: ~|Q| = (365)^n.$$


$\underline{\text{Addendum}}$

By binomial expansion, the numerator of the final answer represents

$$(365)^n - \left[ ~365 - 1 ~\right]^n.$$

This implies that the final computation is

$$\frac{(365)^n - (364)^n}{(365)^n}. \tag1 $$

The expression in (1) above makes perfect sense, since $~(364)^n~$ represents the number of ways that the $~n~$ people can each avoid sharing your birthday.