I am working with a question of birthday problem, and here is the case I have to solve:
Ignoring leap days, the days of the year can be numbered 1 to 365. Assume that birthdays
are equally likely to fall on any day of the year. Consider a group of n people, of which
you are not a member. An element of the sample space Ω will be a sequence of n birthdays
(one for each person).
Call A is the event such that: "There is exist some three people in a group that are share a birthday"
Find the formula to represent the probability of A happens, and find the smallest size of n people such that:
P(A) is larger than 50%
And I have found the correct result for the question in this link (page 2 and 3):
https://oeis.org/A225852/a225852.pdf
Basically the answer state that the P(exist 3 people) = 100% - P(not exist at least a group of 3 people share the same Birthday) = 100% - P(0 people) - P(2 people)
However the thing that make me hard to understand is the process of finding the P(exist at least a pair)
The result said that for P(exist at least pair has the same birthday), one should divide the n people group to 2 sub-groups include: 2i people (with i coincident birthday) and n - 2i for the distinct birthday.
Suppose I have a set of coincident birthday like {1,2,3} - denote in this i = 3 , because i have 3 elements in the set of coincident birthday
=> the result stated that in the next step, we can choose 2i people from the group of n -> that fine, but I cannot understand the process of dividing each selection of each people in the set of 2i (coincident)
the result of that process can state as:
$ \frac{1}{i!} . {2i\choose 2} . {2i - 2 \choose 2} . {2i\choose 4} ... . {4\choose 2} = \frac{2i!}{i!. 2^i} $
-> I can understand the meaning of the product of the combination is choosing 2 person for one coincident birthday in i days ( like day d = 1 has ${2i\choose 2}$ ways, day = 2 has ${2i - 2\choose 2}$ ways ,.. so on, untill only 4 people group left).
=> but why the equation need the term: $\frac{1}{i!}$, I think the order of the pair of choosing people from the set of incidence day doesn't affect to the result of sequence of n people, but why the answer must include that (note in last 2 equation in the page 2 of the link)
Also, why we cannot take the product of choosing the people from the set of 2i people (same birthday) . the number of people with distinct case like (n - 2i) group like a result, but we have to multiply each of the i pairs with the remaining n - 2i people with distinct, then we have to sum all of the case of i $ H_{2_i}$ to form the result (And why the upper bound of the result must be $ floor(\frac{n}{2} ) $ (first of the page 3), which create really ugly result
I hope someone can explain for me intuitively the result from this paper. Thanks in advance
This is not an answer. Instead, this is a long-winded comment.
I abhor trying to delve into math literature that is often poorly written. So, rather than trying to explain literature already written, I will document my approach.
First, consider the much easier approach of computing the probability that at least two people share the same birthday. This is typically done by (instead) computing the complementary probability that no two people share the same birthday.
The standard approach is to express the probability through combinatorics, as
$$\frac{N}{D} ~: ~D = (365)^n.$$
Then,
$$N = \frac{(365)!}{[365 - n]!}.$$
Attacking the complementary probability that no three people share the same birthday may also be done combinatorically, via
$$\frac{N}{D} ~: ~D = (365)^n.$$
However, the computation of $~N~$ is much more complicated. To illustrate the complications, suppose that $~n = 10.~$ Then each of the following would constitute a satisfactory distribution of birthdays, where no 3 people, out of the 10 people, share a birthday:
The 10 people each have 10 distinct birthdays.
The 10 people have 9 distinct birthdays, and there is one occurrence of two people sharing a birthday.
The 10 people have 8 distinct birthdays, and there are two separate pairs of people, such that each of the two pairs share a birthday.
The 10 people have 7 distinct birthdays, and there are three separate pairs of people, such that each of the three pairs share a birthday.
The 10 people have 6 distinct birthdays, and there are four separate pairs of people, such that each of the four pairs share a birthday.
The 10 people have 5 distinct birthdays, and there are five separate pairs of people, such that each of the five pairs share a birthday.
In the above bullet points, it is to be understood that when pairs of people are formed, no two different pairs share the same birthday.
So, one approach would be to set
$$N = N_1 + N_2 + N_3 + N_4 + N_5 + N_6.$$
Then, you would let each of the mutually exclusive bullet pointed situations described above be enumerated by $~N_1, \cdots, N_6,~$ respectively.
Naturally, for general values of $~n,~$ (instead of $~n=10~$), you would then take a similar approach that also involves mutually exclusive situations.
Note:
There may well be a more elegant approach that I am overlooking. Further, this (possible) more elegant approach may or may not involve dealing with mutually exclusive cases.
As someone new to the question of whether there are three people who share a birthday, the approach in this answer represents how I would attack the problem. When I am dealing with a problem that seems new to me, I go slow and careful.
Note:
Using my approach, with $~\displaystyle p \in \left\{0,1,2,\cdots,\left\lfloor \frac{n}{2}\right\rfloor \right\},~$ let $~f(n,p)~$ denote the number of ways that $~p~$ pairs can be formed from the $~n~$ people such that each pair of people share a birthday, and the $~n~$ people therefore have birthdays that fall on $~(n-p)~$ distinct days.
Then, the enumeration of $~N~$ will be
$$N = \sum_{p=0}^{\left\lfloor \frac{n}{2}\right\rfloor} f(n,p),$$
so the problem reduces to computing $~f(n,p),~$
To compute $~f(n,p),~$ when $~p \geq 1,~$ note that the number of ways that the $~p~$ pairs can be formed is
$$\frac{1}{p!} \times \left[\binom{n}{2} \times \binom{n-2}{2} \times \cdots \times \binom{n - 2p + 2}{2}\right] \\ = \frac{1}{p!} \times \frac{n!}{2^p \times (n-2p)!}. \tag1 $$
In (1) above, the $~\dfrac{1}{p!}~$ factor is an over-counting adjustment factor that adjusts for the fact that the $~p~$ pairs may be ordered in $~p!~$ different ways.
Then, with $~p~$ pairs formed, you have that $~(n-p)~$ distinct birthdays will be involved. So, you have to append the factor of $~\displaystyle \frac{(365)!}{[365 - (n-p)]!}.$
Thus, for $~p \geq 1,~$
$$f(n,p) = \frac{1}{p!} \times \frac{n!}{2^p \times (n-2p)!} \times \frac{(365)!}{[365 - (n-p)]!}.$$
Note that the above formula also works for $~p = 0.$