Full Question: The pool of 100 balls contains 20 pink balls, 10 yellow balls, 20 purple balls, 15 indigo balls and 35 green balls. We randomly pull 10 balls out of 100 and randomly distribute them into boxes, each box containing 2 balls. What is the probability that no boxes will have balls of the same color?
This is a probability problem I made up from my high school science fair, only to quickly realise I’m not sure myself how to calculate these probabilities.
What I’ve done:
I realised that if we calculate the probability of any two (at least two) balls of the same colours being next to each other in the sequence and then divide it by two (accounting for the ‘what if the box divider is between them’) and subtract it from 1 we will get the correct answer. Is that reasonable?
But I struggle to calculate that simplified problem. I would love some guidance / suggestions for how to proceed with it. Straight away solutions are also welcomed.
PS: It is worth mentioning that I would love to also solve this problem if the boxes fit x balls inside, and what if balls are distributed randomly, meaning that all boxes have different amount of balls in them.
Let us coin the box containing two balls of same color as monochromatic. Otherwise the box is bichromatic.
By inclusion-exclusion principle the number of selections of 5 bichromatic boxes is: $$ N=\sum_{i=0}^5 (-1)^k\binom5kN_k,\tag1 $$ where $N_k$ is the number of selections with at least $k$ monochromatic boxes, so that the problem boils down to determining $N_k$.
$N_k$ can be determined as follows. Let $$ f_i(x)=\sum_{j=0}^5 \binom{n_i}{\underbrace{2,\dots,2}_j,n_i-2j}\frac{x^j}{j!}= \sum_{j=0}^5\frac{n_i!}{2^j(n_i-2j)!}\frac{x^j}{j!}\tag2 $$ with $(n_1,n_2,n_3,n_4,n_5)=(20,10,20,15,35)$, and $$ F(x)=\prod_{i=1}^5 f_i(x).\tag3 $$ Then $$ N_k=\binom{100-2k}{10-2k}\frac{(10-2k)!}{2^{5-k}}k![x^k]F(x),\tag4 $$ where $[x^k]$ is the so-called coefficient extraction operator.
In $(4)$ the the factor $k![x^k]F(x)$ counts the number of ways to construct $k$ monochromatic boxes. The resulting factors $\frac{k!}{j_1!j_2!j_3!j_4!j_5!}$ (with $\sum_ij_i=k$) represent the number of ways to assign "colors" to $k$ monochromatic boxes ($j_i$ being the number of boxes containing the balls of the $i$-th color). The factor $\binom{100-2k}{10-2k}\frac{(10-2k)!}{2^{5-k}}$ stays for the number of ways to choose the remaining $10-2k$ balls and distribute them among the remaining $5-k$ boxes.
Plugging $(4)$ into $(1)$ one obtains $N$ and finally computes the probability as $$ P=\frac{N}{N_0}. $$
My calculation yields $P\approx0.274$, which matches well with the numerical simulation.