I am reading the following problem:
If $20$ people are selected at random, find the probability that at least $2$ of them have the same birthday. As a follow up, how large a group is needed to give a chance of $0.5$ of at least $2$ people having the same birthday?
My solution for the first part:
The probability that none have the same birthday is:
$$\frac{365\cdot 364\cdot 363\cdot \cdot \cdot 346}{365^{20}}$$
Originally I thought that since the numerator is just $$\frac{365!}{345!}$$ I could calculate that, but all calculators I tried gave me an error so I actually typed all the $20$ numbers in the numerator to do the product and then calculate my final solution which is:
$$1 - \frac{365\cdot 364\cdot 363\cdot \cdot \cdot 346}{356^{20}} = 0.411$$
The solution mentions the same number so it is correct.
But thinking about the follow up question, I am thinking if the brute force approach I used is naive and there is some better way to calculate it. Because for the second part I would need to do some small binary search i.e. starting from over $20$ and doing the products in the calculator until eventually narrowing it down to how many numbers would give $0.5$.
Is there a more efficient/smarter way to calculate the probability and the group number besides actually doing all the multiplications?
I will generalize the question in two ways: first for an arbitrary year length of $d$ days (not necessarily $365$) and second for $m$ persons (not necessarily just two) with the same birthday.
Let us gather together a group of $n$ persons. As starting assumptions, let there be $d$ days in a year ($d = 365$ for the standard birthday problem), and suppose the birthday of each person in the group is uniformly distributed over those $d$ days independently of the birthdays of the other $n-1$ persons.
There are at least two ways to approximate the birthday problem calculation by Poisson distributions.
One way to approximate the probability of matching birthdays is to consider each pair of persons as a possible match and compute the expected number of matching pairs. There are $\binom n2$ pairs of persons in the group. A pair of persons has a $\frac1d$ probability to have the same birthday ($d^2$ possible combinations of birthdays, all equally likely by assumption, among which $d$ are matching dates). The expected number of pairs with the same birthday is therefore $$ \lambda_0 = \frac1d \binom n2. $$ (If there are more than two persons born on a particular day of the year, this method counts all pairs of persons in that group as a pair with matching birthdays).
If the number of matching pairs had a Poisson distribution $X_0$ then the expected number of pairs $\lambda_0$ would be the parameter of that distribution. The chance that there are no matching pairs would therefore be the probability $$ P(X_0 = 0) = e^{-\lambda_0} = e^{-n(n-1)/(2d)}. $$ The number of matching pairs does not have a Poisson distribution but it's close enough for a decent approximation. Note that this approximation is the same as $$ e^{-1/d} e^{-2/d} e^{-3/d} \cdots e^{-(n-1)/d} $$ where each term is the approximation $e^{-k/d} \approx 1 - \frac kd,$ so the product approximates \begin{align} 1 \left(1 - \frac1d\right)\left(1 - \frac2d\right)\cdots\left(1 - \frac{n-1}d\right) &= \frac dd \cdot \frac{d-1}{d} \cdot \frac{d-2}{d} \cdot \cdots \cdot \frac{d-(n-1)}{d} \\ &= \frac{d(d-1)(d-2)\cdots (d-(n-1))}{d^n}. \end{align}
Another approach is to consider each day of the year as a possible shared birthday and compute the expected number of shared birthdays. The expected number of persons born on any particular day of the year is $$\lambda_1 = \frac nd.$$ If the number of persons born on a particular date were a Poisson variable $Y_i$ ($1 \leq i \leq d$) with expected value $\lambda_1$ and the variables $Y_i$ were all independently distributed (neither of these things is true, but this is an approximation), the probability of no two persons born on a particular date would be $$ P(Y_i = 0) + P(Y_i = 1) = e^{-\lambda_1} + \lambda_1 e^{-\lambda_1} = (1 + \lambda_1) e^{-\lambda_1}. $$ and the probability of no matching birthdays for the entire year would be the probability that this happened on every day, $$ \left((1 + \lambda_1) e^{-\lambda_1}\right)^d = (1 + \lambda_1)^d e^{-d\lambda_1} = \left(1 + \frac nd\right)^d e^{-n}. $$
Both methods can be generalized for the probability that there is a subset of at least $m$ persons among the $n$ who all share the same birthday, for a given $m \geq 2.$
For the first method, there are $\binom nm$ subsets of $m$ persons and each subset has a $\left(\frac 1d\right)^{m-1}$ probability of all having the same birthday ($d^m$ possible combinations of birthdays of which $d$ are matching). So we simply set $\lambda_0$ to $$ \lambda_0 = \frac{1}{d^{m-1}} \binom nm $$ and again the probability that we do not have a matching set is $e^{-\lambda_0}.$
For the second method, the expected number of persons born on any particular day of the year is still $\lambda_1 = \frac nd$, but the probability of no $m$ persons born on that day is \begin{multline} P(Y_i = 0) + P(Y_i = 1) + P(Y_i = 2) + \cdots + P(Y_i = m-1) \\ = \left(1 + \frac{\lambda_1}{1!} + \frac{\lambda_1^2}{2!} + \cdots + \frac{\lambda_1^{m-1}}{(m-1)!} \right) e^{-\lambda_1} \end{multline} and therefore the probability of no matching set on any of the $d$ days is $$ \left(1 + \frac{n}{1!\,d} + \frac{n^2}{2!\,d^2} + \cdots + \frac{n^{m-1}}{(m-1)!\,d^{m-1}} \right)^d e^{-n}. $$
I have used this last approximation in a birthday-probability calculator in order to estimate the number of persons necessary to get a certain probability of a subset of $m$ matching birthdays (because for $m>2$ the exact formula is not nearly as nice as for $m=2$ and it is not so easy to simply increase $n$ by $1$) and it usually got me close enough to the correct result. ("Close enough" meant that I could use this result to guess the required number of persons for the desired probability and do only two or three exact calculations to find the actual result.)