Expected number of colors picked from multiple baskets

254 Views Asked by At

Suppose there are $K$ independent boxes, each of which has the same number of different colored balls, denoted as $N$. For the first case, we assume the colors of the balls in the $K$ boxes are the same. For the second case, we assume that the balls in the $K$ boxes are of different colors. For an example, the possible colors of balls are $\{\text{ red} ,\text{blue} ,\text{green} \}$. Assume that each box is of size $2$. Then, for the first case, each box is of balls $\{\text{ red} ,\text{blue} \}$. They are symmetric. For the second case, an example is $\{\text{red} ,\text{blue} \},\{\text{red} ,\text{green} \} $ and $ \{\text{blue} , \text{green} \}$. There is no necessity for the color permutation. But they are asymmetric.

Consider the experiment where each time we pick a single ball from each box randomly and get $K$ colored balls totally. Then, we would like to compare the expected number of colors for the $K$ colored balls, denoted as $\{B_1,B_2\}$, picked independently from the $K$ boxes under the above mentioned two cases. Intuitively, $B_1 \le B_2$. I have demonstrated it when $K = 2$. However, when $K$ goes larger, the case becomes more complicated. I am still confused by how to prove it theoretically.

Any references or approximations would be appreciated. Thanks a lot!

1

There are 1 best solutions below

4
On

In the first case, the expected number of balls of each colour is $K/N$ where $N$ is both the number of balls in each box and the number of different colours. In the 2nd case, the expected number of balls of each colour is $K/N$ where $N$ is the number of different balls, provided they permutate equally frequently as shown in your example. Assuming the number of colours in case 2 must exceed the number of boxes, as implied, then your inequality must be correct.

This is because as follows:

Let $K$ be the number of boxes. Let $N$ be the number of balls in each box. Let $b_m\in B$ be the list of counts of each distinct colour in the box, such that:$$\sum\limits_{b\in B} b_m=N$$ be the number of distinct colours.

Therefore the first case is the special case of the 2nd case in which $b_m=1\forall b\in B$.

In a single draw from a single box, probability colour $m$ is chosen, is $b_m/N$.

In the first case, $b_m=1\forall b\in B$: For one box, so the probability of choosing colour $m$ from a boxis $1/N$

Probability of success or failure is the same for every box so this is a binomial distribution for every colour. In a binomial, $E(X)=np$. We are drawing one from each box so the number of trials is $K$, and probability is $1/N$ so so the expected number is $E(X)=K/N$

Moving on to the 2nd part of the question, when there may be different numbers of each colour in each box. Here we may permit any $b_m\geq 0$, and also potentially different sets $N_p$ for every box. However in the example you give, every permutation of two choices selected from {red, blue, green} is shown. This has he effect that the expected value of each must be equal. Therefore we have:$$\sum\limits_mE(X_m)=E(\sum\limits_m X_m)$$

Therefore for every colour $m$: $$E(X_m)=K/N$$

Where this time, if we use your notation $B$ to replace $N$, is the total number of colours permutated.

Provided you do not permit fewer colours than boxes, which again is implied but not stated in the question, $B\geq N \implies K/B\leq K/N$ so you are correct in your assertion which you stated using $B_n$ that $B_1\leq B_2$.