I would like a better understanding of the famous birthday paradox. "What is the probability that, in a set of n randomly chosen people, some pair of them will have the same birthday?"
I understood the first part, where the probability reaches 100% when the number of people reaches 367 by the pigeonhole principle. But I am not understanding the explanation beyond that. How do they say that the probability is 99.9% with 70 people and 50% with 23 people? And how do you further generalize the answer? And why is it a "paradox"?
Let the number of people in the group be $n$.
The probability that a given pair of people do not share a birthday is $\frac{364}{365}$, ignoring leap years.
There are $\binom{n}{2}$ pairs of people in a group of $n$ people (already $253$ pairs when $n = 23$). No pair of people will share a birthday exactly when every person has a distinct birthday. The probability of this happening is given by
$$\frac{364}{365}\times\frac{363}{365}\times\dots\times\frac{365 - (n-1)}{365}$$
How did I get this probability?
Assume that all birthdays are equally likely. If the first person was born on day $x_1$, then the second person in the group cannot be born on day $x_1$. The probability of this happening is $\frac{364}{365}$. Now let the birthday of the second person be $x_2$. The probability that the third person is born on neither $x_1$ nor $x_2$ is $\frac{363}{365}$. Continuing in the same way, the $n^{\text{th}}$ person must avoid the $n-1$ distinct days already taken, which has probability $\frac{365-(n-1)}{365}$. Each factor is a conditional probability, given that all earlier birthdays are distinct, so by the multiplication rule we multiply all the probabilities.
Thus the probability that at least one pair shares a birthday for a group of $n$ people is given by
$$p = 1 - \left(\frac{364}{365}\times\frac{363}{365}\times\dots\times\frac{365 - (n-1)}{365}\right)$$
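The product above is tedious by hand but easy to evaluate numerically. Here is a minimal Python sketch, assuming 365 equally likely birthdays as in the derivation. The function name `p_shared` and the `days` parameter are my own choices for illustration.

```python
def p_shared(n, days=365):
    """Probability that at least two of n people share a birthday,
    assuming `days` equally likely birthdays (leap years ignored)."""
    if n > days:
        return 1.0  # pigeonhole: more people than available days
    q = 1.0  # running probability that all birthdays so far are distinct
    for k in range(n):
        # person k+1 must avoid the k distinct days already taken
        q *= (days - k) / days
    return 1.0 - q


if __name__ == "__main__":
    for n in (10, 23, 50, 70):
        print(n, p_shared(n))
```

The factor for $k = 0$ is $\frac{365}{365} = 1$, so the loop gives the same product as the formula, which starts at $\frac{364}{365}$.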
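Now you have the probability $p$ as a function of $n$. To go the other way, fix a target value of $p$ and find the smallest $n$ for which the right-hand side reaches it.
It so happens that $n = 23$ is the smallest group with $p \ge 50\%$ (there $p \approx 50.7\%$), and $n = 70$ is the smallest group with $p \ge 99.9\%$.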
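To find $n$ for a given target probability, one option is to keep adding people until the probability first reaches the target. This is a rough Python sketch under the same assumption of 365 equally likely birthdays. The name `smallest_group` is hypothetical, and the target is restricted to values strictly between 0 and 1 to stay clear of floating-point rounding at the endpoints.

```python
def smallest_group(target, days=365):
    """Smallest n for which the shared-birthday probability is >= target."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    q = 1.0  # probability that the first n birthdays are all distinct
    n = 0
    while 1.0 - q < target:
        # add one more person, who must avoid the n days already taken
        q *= (days - n) / days
        n += 1
    return n


if __name__ == "__main__":
    for target in (0.5, 0.999):
        print(target, smallest_group(target))
```

The loop always stops, because the factor for the 366th person is zero, at which point the probability of a shared birthday is exactly 1.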