Solving Birthday Paradox with Triangular Number formula

105 Views Asked by At

I have an incorrect solution to the classic birthday paradox question, and while plugging in some values shows the formula is wrong, I can't see any intuitive reasoning as to why it is incorrect, or where I make my first mistake. In this derivation $v$ is a vector containing $k$ birthdays.

$$ P(at\ least\ 1\ match\ in\ v) \\ =P(v_1\ in\ v_{2:k} | v_{2:k}\ all\ distinct) + P(at\ least\ 1\ match\ in\ v_{2:k}) \\ =P(v_1\ in\ v_{2:k} | v_{2:k}\ all\ distinct) + P(v_2\ in\ v_{3:k} | v_{3:k}\ all\ distinct) + ... + P(v_{k-1}\ in\ v_{k:k} | v_{k:k}\ all\ distinct) + P(v_k\ in\ \emptyset) \\ =\frac{k-1}{365}+\frac{k-2}{365}+...+\frac{1}{365}+\frac{0}{365} \\ = \frac{k-1+k-2+...+1+0}{365}\ \ (use\ triangular\ number\ formula\ here)\\ = \frac{k(k-1)}{2*365} \\ = \frac{k(k-1)}{730} \\ $$

$P(at\ least\ 1\ match\ in\ v)$ should be 1 when $|v|=366$, but here it reaches 1 when $|v|=27.52$.

https://en.wikipedia.org/wiki/Birthday_problem

https://en.wikipedia.org/wiki/Triangular_number

1

There are 1 best solutions below

0
On BEST ANSWER

It has to do with your incorrect use of conditional probability.

What you're trying to express is that, for a match to happen in $v$, we either have the case that $v_{2:k}$ are all distinct and $v_1$ is in $v_{2:k}$ or $v_{2:k}$ already contains one match.

So, instead of the first term being $\mathbb{P} (v_1 \in v_{2:k} \mid v_{2:k} \text{ all distinct.})$ it should be

$$\mathbb{P} (v_1 \in v_{2:k} \land v_{2:k} \text{ all distinct.}) = \mathbb{P} (v_1 \in v_{2:k} \mid v_{2:k} \text{ all distinct.}) \cdot \mathbb{P} (v_{2:k} \text{ all distinct.}).$$

You then also get the correct probability: As you have correctly calculated (assuming uniform probability for a birthday) $\mathbb{P} (v_1 \in v_{2:k} \mid v_{2:k} \text{ all distinct.}) = \frac{k-1}{365}$. The probability of $\mathbb{P} (v_{2:k} \text{ all distinct.})$ is

$$ \frac{365 \cdot 364 \cdot \ldots \cdot (365 - (k-2)) )}{365^{k-1}} = \frac{365!}{(365 - (k-1))! \cdot 365^{k-1}}.$$

Iterating like this, gives us

$$ \mathbb{P} (v_{1:k} \text{ contains at least one match.}) = \sum_{ j = 1}^k \left( \frac{j-1}{365^j} \cdot \frac{365!}{(365 - (j-1))!} \right).$$

Plugging in $k = 366$ in this formula gives $1$ as expected.