Probability of at least $3$ people sharing the same birthday in a group of $n$ people

1.2k Views Asked by At

Disclaimer: I've seen posts with good answers for the case of "at least $2$ people", or "exactly $2$ people". Posts with "at least $k$ people" usually suppose that the mutual birthday is a fixed day (i.e. Jan 2nd) and apply the binomial distribution. This post has to do with the probability of $2$ people sharing the same (random) birthday in a group of $n$ people.


Attempt:

The probability of at least $2$ people having the same birthday in a group of $n$ people is the complement of the probability of everyone having a different birthday. That is: $$ p(n,k\geq 2)=1-\frac{365 \cdot 364 \cdot \dots \cdot (365-n+1)}{365^n} $$ Now, let's suppose that we want to find $p(n \geq 3)$. By making the assumption that $2$ people have already the same birthday, we can treat these two as one person. So the probability of at least $2$ people sharing the same birthday in a group of $n-1$ people is: $$ p(n-1,\geq 2)=1-\frac{365 \cdot 364 \cdot \dots \cdot (365-n+2)}{365^{n-1}} $$ Then, I'm thinking of finding the probability $p(n,k=2)$ of exactly $2$ people sharing the same birthday in a group of $n$ people and somehow calculate: $$ p(n,k\geq 3)=p(n-1,k \geq 2 \bigg| n,k=2) $$ but I'm possibly mistaken. Any thoughts?

2

There are 2 best solutions below

2
On BEST ANSWER

As far as I can tell, your methods do not work and cannot be salvaged.

This is a hard problem that cannot be solved by conventional means. To find the probability that at least three people have a common birthday, let us instead compute the probability that no three people share a birthday. This is equal to $$ \frac{n![x^n](1+x+x^2/2)^{365}}{365^n}.\tag 1 $$ Here, $[x^n]f(x)$ is the coefficient of $x^n$ in the polynomial $f(x)$. You want one minus the above. Another way to write this is $$ \sum_{a_1,a_2,\dots,a_{365}}\frac1{365^n}\frac{n!}{a_1!a_2!\cdots a_{365}!}.\tag 2 $$ where the sum ranges over all vectors $(a_1,a_2,\dots,a_{365})$ of integers between $0$ and $2$ whose sum is $n$. This works because each vector specifies a valid distribution of birthdays where no birthday repeats three times or more, and the multinomial coefficient gives the number of ordered selections of birthdays which have that distribution, and each is then multiplied by the probability $(1/365)^n$ of that ordered selection occurring. You can check that when you expand out $(1)$ and collect the coefficient of $x^{n}$, you get exactly $(2)$.


If you are okay with an approximate result, the expected number of triplets of people who share a common birthday is $\lambda =\binom{n}{3}\cdot \frac1{365}^2$, and the number of triplets with a common birthday is approximately Poisson with parameter $\lambda$. Therefore, the probability some triplet share a birthday is approximately $$ 1-e^{-\binom{n}3/365^2}. $$

0
On

While there's no simple formula for the exact expression given in Mike Earnest's equation (2), there are efficient ways of evaluating it for any given $\ n\ $, one of which has been implemented the packagepmultinom of the R statistical programming language. Here's an R program for comparing the exact probabilities of at least three people having a common birthday with the Poisson approximation for $ n=90\ $ to $\ 100\ $:

library(pmultinom)
us<-1:365
ps<-1:365
for (i in 1:365){
  us[i]<-2
  ps[i]<-365^(-1)
}
for(n in 90:100){
  a = 1-pmultinom(upper=us,size=n,probs=ps,method="exact")
  b=1-exp(-n*(n-1)*(n-2)/(6*365^2))
  v<-c(n,a,b)
  print(v)
}

Running it online here produces the following results: $$ \begin{matrix} n &\mbox{exact} & \mbox{Poisson approx.}\\ 90 & 0.5341956 & 0.5859698\\ 91 & 0.5456984 & 0.5982312\\ 92 & 0.5571482 & 0.6103927\\ 93 & 0.5685366 & 0.6224440\\ 94 & 0.5798554 & 0.6343752\\ 95 & 0.5910964 & 0.6461763\\ 96 & 0.6022517 & 0.6578381\\ 97 & 0.6133135 & 0.6693514\\ 98 & 0.6242745 & 0.6807075\\ 99 & 0.6351272 & 0.6918979\\ 100 & 0.6458645 & 0.7029148 \end{matrix} $$ So for this range of values the Poisson approximation is about $10\%$ too latge.