I am a newbie with the Statistic and Probability, and I was encountered with the birthday problem. And here is is the question I have to answer:
Ignoring leap days, the days of the year can be numbered 1 to 365. Assume that birthdays
are equally likely to fall on any day of the year. Consider a group of n people, of which
you are not a member. An element of the sample space Ω will be a sequence of n birthdays
(one for each person).
Call A is the event such that: "There is exist some three people in a group that are share a birthday"
Find the formula to represent the probability of A happens, and find the smallest size of n people such that:
P(A) is larger than 50%
And I had founded the answer for this n is n = 88 or 87 ( because the P(A) is approximate), but the thing that make me wonder is the accurate formula of the P(A).
As I read this source (in the last page):
https://www.physics.harvard.edu/files/sol46.pdf
the probability for the case of exist some group of d people (d < n - total number of people in the group) can be written as
$ 1-\left(1-\frac{1}{365^{\left(d-1\right)}}\right)^{n \choose d} $
=> To explain why this equation, the explanation was the included as a part of the link I embedded above - in the last page)
=> But when I plug the number in (d = 3) and n = 88 -> the probability is return higher than 50 (and the Probability nearly 50% is with the size of 83 people)
=> So my question is.
- What is the correct equation for this probability ?
- Why the equation I given is incorrect ?
- How to get the intuition for this question (the way of thinking to approach this - because really less reference write about this case of the birthday problem)
Thanks in advance
You can solve this particular question with a recursion, without making the false assumption about independence of different days.
Let $p(n,t)$ be the probability that with $n$ people, you have $t$ birthdays shared by two people and $n-2t$ birthdays with just one person, and with no days with three or more people sharing birthdays.
Then by considering what happens with an extra person, you can say $$p(n,t)=\frac{365-t-(n-1-2t)}{365}p(n-1,t) + \frac{n-1-2(t-1)}{365}p(n-1,t-1) \\ = \frac{366-n+t}{365}p(n-1,t) + \frac{n+1-2t}{365}p(n-1,t-1)$$
starting with $p(0,0)=1$ and $p(0,t)=0$ for $t \not=0$.
The probability that with $n$ people there is at least one day with three or more people sharing that birthday is then $$1- \sum\limits_t p(n,t).$$
Doing the calculations with R, the probability is $0.4994549$ when $n=87$ and $0.5110651$ when $n=88$.