There are $N$ balls of $M$ different colors in a box i.e $c_1$ balls of color $1$ and so on. $c_1 + c_2 + \dots + c_M=N$, $c_1, c_2, \dots, c_M$ are known. We are looking for a ball of a particular color, say, $j$.The balls are picked from the box randomly, and if the color of the ball picked up say $i$ is such that, $i \ne j$, all the balls of color $i$ disappear. How many times do I need to pick to get a ball of color $j$ on average?
2026-04-05 06:41:01.1775371261
On
N balls having M different colors in a box, how many times do I need to pick to get one particular color?
411 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail At
2
There are 2 best solutions below
1
On
The closed form seems hard. A computational alternative is:
- Let $P(x,j)$ be the probability that we pick a ball of $j$ th kind for the first time in $x+1$ th picking.
- Let $S_j$ be the set of solutions of $x_1+\dots +x_{j-1} +x_{j+1}+\dots+x_m=x$ with the restriction that $0\le x_i \le c_i \ \forall \, 1\le i \le m \, , \, i\ne j$ (Can be generated by a recursive program.)
- Let $z_{j,l}:=\langle y_1,\dots,y_{j-1},y_{j+1},\dots,y_m \rangle \in S_j$. Then, the probability that $y_1$ balls of type-1, $y_2$ balls of type-2 and so on were selected out of a total of $x$ balls is $$P(x,j,l):=\binom{x}{y_1,\dots,y_{j-1},y_{j+1},\dots,y_m}\frac{\prod_{i=1;i\ne j}^{M}(c_i)(c_i-1)\cdots (c_i-y_i+1)}{N(N-1)\cdots (N-x-1)}$$ Note that we are taking into account the order in which the balls are picked. Elements are $S_j$ are indexed by $l$. Make sure that the recursive program gives output in some canonical order.
- Hence, $P(x,j)=\sum_{l=1}^{|S_j|}P(x,j,l)\cdot \frac{c_j}{N-x}$ and the expected value is $\sum_{x=1}^{N}x \cdot P(x,j)$
I don't know that there is a simple closed formula for this, but there is a fairly simple way to compute it recursively.
Let's say we are looking for color $1$, just to be specific. For any data $\{c_i\}_{1}^m$, let E[{c_i}] be the expected number of turns required to find color 1. We see that:
$$E[\{c_i\}]=\frac{c_1}{\sum {c_i}}\,1+\sum_{j=2}^{j=m}\frac{c_j}{\sum {c_i}}(E[\{c_1,c_2, ...,\hat {c_j},...c_m\}]+1)$$ Where, as usual, the $\hat {c_j}$ means that this term is deleted from the data.
this is very easy to automate. Perhaps it can be solved in closed form, though the shifting probabilities make that look a little daunting.