Clarify the solution for the birthday problem with more than 2 people in a group of n people

56 Views Asked by At

I am a newbie with the Statistic and Probability, and I was encountered with the birthday problem. And here is is the question I have to answer:

Ignoring leap days, the days of the year can be numbered 1 to 365. Assume that birthdays
are equally likely to fall on any day of the year. Consider a group of n people, of which
you are not a member. An element of the sample space Ω will be a sequence of n birthdays
(one for each person).

Call A is the event such that: "There is exist some three people in a group that are share a birthday"

Find the formula to represent the probability of A happens, and find the smallest size of n people such that:

P(A) is larger than 50%

And I had founded the answer for this n is n = 88 or 87 ( because the P(A) is approximate), but the thing that make me wonder is the accurate formula of the P(A).

As I read this source (in the last page):

https://www.physics.harvard.edu/files/sol46.pdf

the probability for the case of exist some group of d people (d < n - total number of people in the group) can be written as

$ 1-\left(1-\frac{1}{365^{\left(d-1\right)}}\right)^{n \choose d} $

=> To explain why this equation, the explanation was the included as a part of the link I embedded above - in the last page)

=> But when I plug the number in (d = 3) and n = 88 -> the probability is return higher than 50 (and the Probability nearly 50% is with the size of 83 people)

=> So my question is.

  1. What is the correct equation for this probability ?
  2. Why the equation I given is incorrect ?
  3. How to get the intuition for this question (the way of thinking to approach this - because really less reference write about this case of the birthday problem)

Thanks in advance

1

There are 1 best solutions below

0
On

You can solve this particular question with a recursion, without making the false assumption about independence of different days.

  • Let $p(n,t)$ be the probability that with $n$ people, you have $t$ birthdays shared by two people and $n-2t$ birthdays with just one person, and with no days with three or more people sharing birthdays.

  • Then by considering what happens with an extra person, you can say $$p(n,t)=\frac{365-t-(n-1-2t)}{365}p(n-1,t) + \frac{n-1-2(t-1)}{365}p(n-1,t-1) \\ = \frac{366-n+t}{365}p(n-1,t) + \frac{n+1-2t}{365}p(n-1,t-1)$$

  • starting with $p(0,0)=1$ and $p(0,t)=0$ for $t \not=0$.

  • The probability that with $n$ people there is at least one day with three or more people sharing that birthday is then $$1- \sum\limits_t p(n,t).$$

Doing the calculations with R, the probability is $0.4994549$ when $n=87$ and $0.5110651$ when $n=88$.

days <- 365
p <- matrix(0, nrow=2*days+2, ncol=days+1) # R matrices count from 1 so offset
p[1,1] <- 1   
t <- 1:days              
for (n in 1:(2*days+1)){
  p[n+1,1] <-   (days+1-n) * p[n,1] / days
  p[n+1,t+1] <- ((days+1-n+t) * p[n,t+1] + (n+1-2*t) * p[n,t]) / days  
  }
probdaysharedbyatleast3 <- 1-rowSums(p)[-1]                   # remove offset
probdaysharedbyatleast3[87:88]
# 0.4994549 0.5110651