Why does the order matter when I throw differently colored balls into bins? (with limited bin capacity)

198 Views Asked by At

I have $n$ bins, $n$ red balls, and $n$ green balls. I throw the balls into randomly selected bins one at a time. Once a bin has $2$ balls in it, I close the bin. It matters to me how many bins will have two red balls at the end.

Does the order I throw the balls matter?

The answer is yes, which I've verified through simulation and looking at the $n=2$ case, but I'm wondering whether there is a powerful conceptual explanation. This could take the form of calculating the expected number of red/red bins for any $n$ in ordered vs randomized sets of balls and noting the difference in the process, or it could take some other form.

2

There are 2 best solutions below

3
On

The $n=2$ case is in itself a perfectly cromulent argument, of course, but we can also see there is a difference for large $n$ by looking at the probability of getting no red/red bins at all.

If we're throwing all the red balls first, then each of them needs to land in a bin that hasn't been hit yet, so $$ P_{\text{reds first}}(\text{no red/red}) = \frac{n}{n}\frac{n-1}{n}\cdots\frac{1}{n} = \frac{n!}{n^n} $$

On the other hand, if we're throwing balls in a random order, then we might as well start by throwing $2n$ numbered balls in order (closing bins along the way) and only after we know which number goes where decide which of the numbers are red and green. This means that we end up with the same distribution as if we start by picking two balls for the first bin, then two balls for the next bin, and so forth. To avoid having any red/red bins the second ball we choose for each bin must be of the opposite color from the first, so $$ P_{\text{random order}}(\text{no red/red}) = \frac{n}{2n-1}\frac{n-1}{2n-3}\frac{n-2}{2n-5}\cdots\frac{1}{1} = \frac{n!}{(2n)!/(2^n n!)} = \frac{2^n(n!)^2}{(2n)!} $$

Using Stirling's approximation, the logarithms of these go as (ignoring terms of order $O(\log n)$): $$ \log\frac{n!}{n^n} \approx n \log n - n - n\log n = -n $$ $$ \log \frac{2^n(n!)^2}{(2n)!} \approx n\log 2 + 2n\log n - 2n - 2n \log 2n + 2n = -n\log 2 $$ Since $\log2 < 1$, the probability for no red/red in the "reds first" model decreases strictly faster than in the "random order" model.


This doesn't mean that the expected number of red/red bins for the two models can't have the same behavior, but at least if you're after the actual probability distribution on the number of red/red bins, then the order matters, also asymptotically.

0
On

A simpler way to see that the order of balls matters in a stronger sense than Henning Makholm's answer is the following.

Start by throwing $n-1$ red balls into bins. Something happens: $a$ bins have two red balls, $b$ bins have exactly one, and $c$ bins are empty. We'll condition on the values of $a,b,c$. Notice that $c\geq 1$. If $b=0$ then what happens next can't affect the outcome, so assume $b\geq 1$ (for any $n\geq2$, this has positive probability).

Now consider two cases. If the next ball is red, the chance it completes a red/red bin is $\frac{b}{b+c}$. If the next ball is green and the ball after that is red, the chance that the red ball completes a red/red bin is $$\frac{b}{b+c}\times\frac{b-1}{b+c-1}+\frac{c}{b+c}\times\frac{b}{b+c}=\frac{b}{b+c}\times\left(\frac{c}{b+c}+\frac{b-1}{b+c-1}\right).$$ Note that $$\frac{c}{b+c}+\frac{b-1}{b+c-1}=\frac{c(b+c-1)+(b-1)(b+c)}{(b+c)(b+c-1)}=1-\frac{c}{(b+c)(b+c-1)}<1.$$ Therefore the probability of completing a red/red bin with the final red ball, given what happened so far, is strictly less if it is in position $n+1$ than position $n$, for all situations where this event is possible.

It follows by averaging that the orderings R...RGRG...G and R...RRGG...G give different expected numbers of red/red bins for every $n\geq2$.