Problem regarding coloring balls drawn from a bin

461 Views Asked by At

I'm not sure if this has been asked before, but here I have the following problem:

There are $n$ indistinguishable red balls in a bin. Each "round," $k$ balls are randomly chosen from the bin with equal probability. All $k$ balls are colored blue and put back into the bin (this means that balls that are already blue are not changed). What is the expected number of rounds needed to color all $n$ balls blue?

Is there an explicit formula for this problem? I've thought about this problem for a bit and quickly got stumped, but I realized that unless there was some insight for the problem, a recursive formula of some sort is probably needed as each round relies on the results of the previous rounds. If the formula turns out to be recursive, is there some relatively "simple" heuristic that can estimate the expected number of rounds?

There's also the tricky case of infinitely drawing blue balls after initially drawing some red balls, which would result in an infinite number of rounds. But my gut feeling says that the probability of this happening are small enough that the expected number of rounds should be finite.

After searching a bit online, this seems to be (correct me if I'm wrong) the Coupon collector's problem but multiple coupons are drawn at once instead of just one at a time, and that each of the coupons drawn are all distinct. If that's the case, can any of the insights from that problem be used in the problem I have currently?

3

There are 3 best solutions below

0
On BEST ANSWER

It can be calculated as

$${\bf e_1}^T \left(I_n - \left({\frac{ {{n-i \choose j-i}} {i \choose k-j+i} }{n\choose k}} \right)_{i=0,j=0}^{n-1} \right)^{-1} {\bf 1}$$

Here $I_n$ is the $n \times n$ identity matrix, $\bf e_1$ is the first standard basis vector of $\mathbb{R}^n$ and ${\bf1}$ is the $n$-vector of all ones. Notice that the product with the vectors actually just calculates the sum of the first row of the matrix, and also that $i$ and $j$ go from $0$ to $n-1$.

We get the following values for $n=1\dots 10$ (listed for values of $k=1\dots n$):

n =  1 :  [1]
n =  2 :  [3, 1]
n =  3 :  [11/2, 5/2, 1]
n =  4 :  [25/3, 19/5, 7/3, 1]
n =  5 :  [137/12, 671/126, 29/9, 9/4, 1]
n =  6 :  [147/10, 97/14, 327/76, 41/14, 11/5, 1]
n =  7 :  [363/20, 5693/660, 85691/15810, 257/68, 11/4, 13/6, 1]
n =  8 :  [761/35, 1003423/96525, 974651/148005, 45949/9867, 951/275, 71/27, 15/7, 1]
n =  9 :  [7129/280, 122429/10010, 3352087/429940, 161699429/29100500, 524/125, 5915/1826, 89/35, 17/8, 1]
n =  10 :  [7381/252, 961349/68068, 8241679/911064, 1445423/222794, 3909721/792407, 916379/236379, 733/238, 109/44, 19/9, 1]
8
On

For $n = 3$ and $k = 1$ ... the story unfolds thus ...

Round 1: 1 red ball, colored blue, put back in the bag

Round 2: Say you did pick a red ball. You color it blue and put it back in the bin.

Round 3: Say you picked the last red ball (you have the Devil's luck). You color it blue and ... we're done!

That is, the minimum number of rounds = $3$, for $n = 3$ and $k = 1$

What is the probability, call it P(C), of this happening?

$P(C) = 1 \times \frac{2}{3} \times \frac{1}{3} = \frac{2}{9}$

The maximum number of rounds would correspond to the following probabilities:

  1. $1 \times \frac{1}{3} \times \frac{1}{3} \times ...$ (picking the same ball colored blue every round after the first)

  2. $1 \times \frac{2}{3} \times \frac{2}{3} \times ...$ (picking the same balls colored blue every round after the second)


From the above we can see that for $n$ balls, if we pick $k$ balls at a time (assuming you're insanely lucky),

Round 1, $k$ balls colored blue: $n - k$ balls left
Round 2, $k$ balls colored: $n - 2k$ balls left
Round $m$, Another $k$ balls colored: $n - mk$ balls left

The rounds end when $0$ balls are left $n - mk = 0$
$mk = n $
$m= \frac{n}{k}$

The minimum number of rounds ($m$) = $\frac{n}{k}$. If the answer is a mixed fraction, round to the nearest, greater integer.

The probability of this happening = $1 \times \frac{n - k}{n} \times \frac{n - 2k}{n} \times ... \times \frac{n - (m - 1)k}{n}$


EDIT 1 START

For 3 balls ...

The probability of either of the 2 scenarios where you fail to color all the balls blue = $P(F)$

$P(F) = \left( 1 \times \frac{1}{3} \times \frac{1}{3} \times ...\right) + \left( 1 \times \frac{2}{3} \times \frac{2}{3} \times ... \right) = \left( \frac{1}{3} \right)^n + \left( \frac{2}{3}\right)^n$

Calculate $P(F)$ such that $P(F) \leq 5\%$ i.e. $\left( \frac{1}{3}\right)^n + \left(\frac{2}{3}\right)^n \leq 5\%$.

Then, the probability of success = $1 - P(F)$ (???)

Generalize for $n$ balls.
EDIT 1 END


EDIT 2 START

X = The number of rounds to color all the balls

$E(X) = p_1 \cdot x_1 + p_2 \cdot x_2 + ... + p_n \cdot x_n$

where $p_n$ is the probability that $x_n$ rounds are required to color all the balls.

I have calculated, for 3 balls ($n$ = 3), picking 1 ball at a time ($k$ = 1), the probability that 3 rounds will be required ($\frac{2}{9}$).

The probability that 4 rounds will be required = $1 \times \frac{1}{3} \times \frac{2}{3} \times \frac{1}{3} + 1 \times \frac{2}{3} \times \frac{2}{3} \times \frac{1}{3} \times 2 = \frac{10}{27}$

Continuing along this trajectory, we can calculate the theoretical probability of increasing number of rounds that are required to color all the balls blue. As $x_n$ increases, $p_n$ decreases (tends to $0$). We will have to make a call, so to speak, if you must have an average number of rounds. That is exactly what $95\%$ probability is all about.

$E(X) = \frac{2}{9} \cdot 3 + \frac{10}{27} \cdot 4 + ... + p_c \cdot x_c$

$x_c$ is the critical number of rounds needed for success such that $p_c$ is practically $0$. I have chosen $p_c = 0.05 = 5\%$ In other words, in our world, $5\% = 0\%$

3
On

You already have an excellent solution, but here is a solution by a different method that renders the result in a different form.

Define $T$ to be the number of the first draw on which all red balls have been seen; we want to find $E(T)$. We will use the theorem that $$E(T) = \sum_{m=0}^{\infty} P(T > m)$$ Now $T>m$ if and only if at least one ball has never been seen in draws one through $m$; by inclusion/exclusion, this probability is $$P(T > m) = \sum_{j=1}^n (-1)^{j+1} \binom{n}{j} \frac{\binom{n-j}{k}^m}{\binom{n}{k}^m}$$ So $$E(T) = \sum_{m=0}^{\infty} P(T > m) = \sum_{m=0}^{\infty}\sum_{j=1}^n (-1)^{j+1} \binom{n}{j} \frac{\binom{n-j}{k}^m}{\binom{n}{k}^m}$$ Switching the order of summation, $$\begin{align} E(T) &= \sum_{j=1}^n \sum_{m=0}^{\infty} (-1)^{j+1} \binom{n}{j} \frac{\binom{n-j}{k}^m}{\binom{n}{k}^m} \\ &= \sum_{j=1}^n (-1)^{j+1} \binom{n}{j} \sum_{m=0}^{\infty} \frac{\binom{n-j}{k}^m}{\binom{n}{k}^m} \\ &= \sum_{j=1}^n (-1)^{j+1} \binom{n}{j} \frac{1}{1-\binom{n-j}{k}/\binom{n}{k}} \end{align}$$ In the last step we summed a geometric series.