My paradox in determining the probability of distributing similar balls into different boxes


Inspired by the discussion between @Matthew Pilling and me in the post "What is the probability that there more rabbits than chickens in each of these three cages?", I am trying to find and understand the answer to a simple problem I have been thinking about for a while. (Feel free to read the comments under my answer in that post if you are interested and want to know more.)

Let's say I have two identical balls and I want to distribute them into two different boxes (box1 and box2). What is the probability of having one ball in each box?

Approach 1: We can count the solutions of $n_{1} + n_{2} = 2$, where $n_{1},n_{2} \in W$ (the whole numbers); there are three: $(0,2)$, $(1,1)$ and $(2,0)$. Of these, only one satisfies $n_{1},n_{2} \ge 1$. The probability is then calculated as $\frac{1}{3}$.

Approach 2: Or, we can think of this problem as follows: we have two boxes. The probability of choosing the first box is $\frac{1}{2}$. So, there is a $\frac{1}{4}$ probability of choosing the first box for both balls. The same reasoning holds for choosing the second box for both balls. So, in total, there is a $\frac{1}{2}$ probability of having both balls in the same box. And, thus, the probability of having exactly one ball in each box is $\frac{1}{2}$.

Could someone please help me understand what's happening here? Are we dealing with two different problems (in these two approaches), or is one of the approaches wrong because ... ?

Thanks for any input you may provide.

Nima


There are 3 best solutions below


The solutions in Approach 1 are not equally likely. That's like saying "When I roll two six-sided dice, the probability of rolling a sum of $2$ is $1/11$ because there are $11$ options: $2$ through $12$."


The probability that the two balls end up in different boxes is $1\cdot \frac 1 2=\frac 1 2$. It doesn't matter which box the first ball goes into; the second ball must then go into the box the first ball did not go into.
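The same computation can be written out by conditioning on the first ball. This is only a sketch: $B_1$ and $B_2$ are names introduced here (not in the question) for the boxes chosen by the first and second ball, assumed independent and each equally likely to be box 1 or box 2.

$$P(B_1 \ne B_2) = \sum_{b \in \{1,2\}} P(B_1 = b)\,P(B_2 \ne b) = \frac12\cdot\frac12 + \frac12\cdot\frac12 = \frac12.$$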


Your problem is a very common one. Your approach 1 is a combinatorial approach: you count the solutions to a discrete (integer) problem. Then you use the classical rule that probability equals "good cases" divided by "all cases".

However, this is only true if you know a priori that each one of "all cases" is as likely to come up as any other. In the classical "throw a die" problem, that assumption is encoded in the statement that the dice involved are fair.

In your approach 1, you consider the 3 solutions to the problem $n_1+n_2=2, n_1, n_2 \in \mathbb Z, n_1,n_2 \ge 0.$

You are incorrectly thinking that each of those 3 cases is equally likely to come up. They aren't, because, as your approach 2 shows, the actual process by which you distribute the balls is:

  1. Put ball #1 into a box, each with probability $\frac12$, then
  2. Put ball #2 into a box, each with probability $\frac12$.

Your approach 2 directly shows that $n_1=0, n_2=2$ and $n_1=2, n_2=0$ both have probability $\frac14$, while $n_1=1, n_2=1$ has probability $\frac12$.
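These probabilities can be checked by listing the equally likely ordered placements and collapsing them to the pairs $(n_1, n_2)$. The following is a minimal Python sketch, assuming each ball independently goes into box 1 or box 2 with probability $\frac12$; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: each ball independently lands in box 1 or box 2 with probability 1/2,
# so the four ordered placements below are equally likely.
boxes = (1, 2)
placements = list(product(boxes, repeat=2))  # (box of ball #1, box of ball #2)

# Collapse each ordered placement to the occupancy pair (n1, n2).
occupancy = Counter((p.count(1), p.count(2)) for p in placements)

# Probability of each pair = number of placements producing it / total placements.
probabilities = {
    pair: Fraction(count, len(placements)) for pair, count in occupancy.items()
}

for pair in sorted(probabilities):
    print(pair, probabilities[pair])
```

Two of the four placements, $(1,2)$ and $(2,1)$, collapse to the same pair $(1,1)$, which is why that pair is twice as likely as each of the others.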

The main problem is the incorrect belief that, just because you can model the result of putting balls into boxes by the three pairs $(0,2), (1,1), (2,0)$, those pairs are equally likely.

In a similar case, the sum of any 2 independent, fair, standard 6-sided dice is a number from $2$ to $12$. Those 11 possible values are not equally likely either: they are concentrated around the middle, with a sum of $7$ having probability $\frac16$, while $2$ and $12$ have the much lower probability $\frac1{36}$ each.
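The dice figures can be reproduced the same way, by enumerating the 36 equally likely ordered rolls and grouping them by their sum. This is a short Python sketch assuming two independent fair dice; the names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: two independent fair six-sided dice, so all 36 ordered rolls
# are equally likely.
faces = range(1, 7)
rolls = list(product(faces, repeat=2))

# Count how many ordered rolls produce each sum.
sum_counts = Counter(a + b for a, b in rolls)
sum_probabilities = {
    total: Fraction(count, len(rolls)) for total, count in sum_counts.items()
}

for total in sorted(sum_probabilities):
    print(total, sum_probabilities[total])
```

There are 11 distinct sums, but six ordered rolls give a sum of 7 while only one gives 2 and only one gives 12.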

Again, your main error is assuming that some representation of the result of the choices is going to be one where each outcome is as likely to come up as any other.