Calculate the probability that red balls have a majority in at least one of the buckets

199 Views Asked by At

There are total of $n$ balls, $r$ of which are red and the rest are black. When they are divided equally and randomly among $k$ buckets, I'm trying to find the probability that at least one bucket has a majority of red balls, assuming that $r$ is such that it is possible to get a majority in a bucket, i.e. $r > \frac{n}{2k}$.

Example: There are $100$ balls, $7$ of which are red and the rest are black. When they are divided into $10$ equal buckets, I'm trying to calculate the probability that at least one bucket have a MAJORITY of red balls, i.e. at least one bucket should have minimum of $6$ red balls.

I have tried different ways of breaking down the problem into combinations and probability problem. I tried calculating the combination the red balls can be arranged and then calculating the probability, but I'm not sure if I am going in the right direction. Any help is appreciated.

2

There are 2 best solutions below

0
On

What about considering one bucket as a binomial distribution, so that for the red balls $R - Bin(s, r, \frac{1}{k})$ for $s\in \{0,..,r\}$ and black balls $B - Bin (t, n-r, \frac{1}{k})$ for $t \in \{0,..,n-r\}$

The the probability that the number of red balls is greater than the number of black balls for given $t$ is $P(s>t)=\sum_{s=t+1}^{r} \; ^rC_s \; \left( \frac{1}{k} \right)^s \left(1-\frac{1}{k}\right)^{r-s}$

And the complete probability measured over all t is:

$P(red > black) = {\sum_{t=0}^{n-r}} \; ^{n-r}C_t \; \left( \frac{1}{k} \right)^t \left(1-\frac{1}{k}\right)^{n-r-t} P(s>t) $

Sorry I don't know how to extend this multinomially to all buckets at the mom.

0
On

This answer assumes that balls of the same color are indistinguishable, and that buckets are distinguishable.

Let $n=ak$, so that each bucket will receive $a$ balls, and $r>a/2$.

Each distribution of balls among the $k$ buckets corresponds to a solution of $r_1+r_2+\dots+r_k = r$ with $0\leqslant r_i \leqslant a$ for all $i$. An involved application of stars and bars and inclusion-exclusion can provide the number $S$ of such solutions, but I find it much simpler to use generating functions. Indeed, we'll have that

\begin{align}S &= \left[x^r\right]{\left(1+x+x^2+\dots+x^a\right)}^k \\&= \left[x^r\right]{\left(\frac{1-x^{a+1}}{1-x}\right)}^k \end{align}

Now, to find the probability $p$ we're interested in, we need only calculate the number $R$ of cases for which some bucket $i$ has $r_i>a/2$. Then, $p=R/S$. For this, we will unfortunately resort to inclusion-exclusion.

Let $A_i$ be the set of ball-bucket distributions for which $r_i > a_2$. Then, by inclusion exclusion:

\begin{align} \left |\bigcup_{i=1}^k\,A_i\right| &= \sum_{j=1}^k\,(-1)^{j+1}\cdot\left(\sum_{1\leqslant i_1\leqslant \dots \leqslant i_j\leqslant k}\,\left|A_{i_1}\cap\dots\cap A_{i_j}\right|\right) \\&= \sum_{j=1}^k\,(-1)^{j+1}\cdot\binom{k}j\,\alpha_j \tag{$*$} \end{align}

where $\alpha_j = \left|A_{i_1}\cap\dots\cap A_{i_j}\right|$ and the equality in $(*)$ is justified by symmetry. Thus, we need only calculate the coefficients $\alpha_j$.

That would be the number of solutions to $r_1+r_2+\dots+r_k = r$ with $a/2 <r_1,\dots, r_j \leqslant a$ and $0\leqslant r_{j+1}, \dots, r_k \leqslant a$. Once again, we use generating functions. We'll have

\begin{align}\alpha_j &= \left[x^r\right] {\left(x^{\left\lceil a/2\right\rceil}+x^{\left\lceil a/2\right\rceil + 1}+\dots+x^a\right)}^j \cdot{\left(1+x+x^2+\dots+x^a\right)}^{k-j} \\&= \left[x^r\right] x^{\,j\cdot\left\lceil a/2\right\rceil}{\left(1+x+\dots+x^{a-\left\lceil a/2\right\rceil}\right)}^j \cdot{\left(\frac{1-x^{a+1}}{1-x}\right)}^{k-j} \\&= \left[x^{r-j\cdot\left\lceil a/2\right\rceil}\right] {\left(\frac{1-x^{\left\lfloor a/2\right\rfloor +1}}{1-x}\right)}^j \cdot{\left(\frac{1-x^{a+1}}{1-x}\right)}^{k-j} \\&= \left[x^{r-j\cdot\left\lceil a/2\right\rceil}\right] \frac{{\left(1-x^{\left\lfloor a/2\right\rfloor +1}\right)}^j\,{\left(1-x^{a+1}\right)}^{k-j}}{{(1-x)}^k} \end{align}

Of course, if $j\cdot\left\lceil a/2\right\rceil > r$, then $\alpha_j = 0$. We need hence only consider $j\leqslant \big\lfloor r/\left\lceil a/2\right\rceil\big\rfloor $. This gives us a somewhat ugly but nonetheless functional formula for $p$:

$$ p= \frac{ \displaystyle\sum_{j=1}^{\big\lfloor r/\left\lceil a/2\right\rceil\big\rfloor}\, (-1)^{j+1}\cdot\binom{k}j\,\left[x^{r-j\cdot\left\lceil a/2\right\rceil}\right] \frac{{\left(1-x^{\left\lfloor a/2\right\rfloor +1}\right)}^j\,{\left(1-x^{a+1}\right)}^{k-j}}{{(1-x)}^k} }{ \displaystyle\left[x^r\right]{\left(\frac{1-x^{a+1}}{1-x}\right)}^k } $$


Plugging the values of $n=100$, $r=7$ and $k=10$, so that $a=10$, we get $\left\lceil a/2\right\rceil = \left\lfloor a/2\right\rfloor = 5$ and $\big\lfloor r/\left\lceil a/2\right\rceil\big\rfloor = \lfloor 7/5 \rfloor =1$. It follows that $j=1$ is the only index in the summation and hence

\begin{align} p&= \frac{\displaystyle 10\cdot\left[x^2\right] \frac{{\left(1-x^{6}\right)}\,{\left(1-x^{11}\right)}^{9}}{{(1-x)}^{10}} }{ \displaystyle\left[x^{7}\right]{\left(\frac{1-x^{11}}{1-x}\right)}^{10} } \\&= \frac{\displaystyle 10\cdot 55 }{ 11440 }\simeq 4.81\% \end{align}