Expected number of steps to get N red balls choosing K at a time

606 Views Asked by At

I have a problem that can be reduced to a classic box with red and black balls.

We have a box containing $N$ red balls and $M$ black balls. Taking $K$ balls at a time (without replacement), what is the expected number of tries I need to get all $N$ red balls.

After looking similar questions I see we need to calculate the expected value of the given probability distribution, but can't figure out which one is it.

Let's take a look at an example of what we are looking at:

Assume N=3, M=8 and K=2.

  • In the best case, after 2 tries I'd already found 3 red balls (i.e., 2 at the 1st and 1 at the 2nd, or viceversa).
  • In the worst case, it will take 6 tries to take the red balls (i.e., I'll first find all black balls). There are actually many variations of this worst case where I find some red ball before, but the last at the end.

But, how to define this "average" expected case?

3

There are 3 best solutions below

0
On

Let's take a simple example $N=3$, $M=3$, and $K=2$. The possible draws to choose all 3 red balls in this case are (If I missed any possibility, correct me please)

RR RB = 2 steps with probability $p_1$ (R = red, B = black)

RB RB RB = 3 steps with probability $p_2$

RB RR = 2 steps with probability $p_3$

RB BB RR = 3 steps with probability $p_4$

RR BB RB = 3 steps with probability $p_5$

BB RB RR = 3 steps with probability $p_6$

BB RR RB = 3 steps with probability $p_7$

You need to calculate the probability of each event above. The expected/average number of steps is calculated as in

$$\sum_ip_ix_i$$

where $p_i$ is the probability and $x_i$ is the number of steps for event $i$.

2
On

The simplest case is when $K=1$ in which case the expected number appears to be $\dfrac{N(N+M+1)}{N+1}$, a result related to $\sum\limits_{j=0}^M \frac{M \choose j}{N+M \choose j} = \frac{N+M+1}{N+1}$

There may or may not be quite such a simple expression for general $K$, and having $K$ not needing to divide $N+M$ may complicate the issue, but it should be slightly larger than $\dfrac{N(N+M+1)}{K(N+1)}$

For example, with $N=3$, $M=8$ and $K=1$, I think the expected number of tries is $9$. This suggests that with $K=2$ it should be slightly more than $4.5$; detailed calculation seems to give $2 \times \frac{4}{165}+3 \times \frac{16}{165}+4 \times \frac{36}{165}+5 \times \frac{64}{165}+6 \times \frac{45}{165}=\frac{158}{33} \approx 4.79$. The squares in that cacluation hint at some possibilities worth investigating further

0
On

Let $S_j$, for $N+M \ge j \ge N$, be the number of sequences of $M+N$ red and black balls where $N$ red balls occur within the first $j$ balls and the $j$th ball is red.

$S_j = {j-1 \choose N-1}N!M!$, since there are $j-1 \choose N-1$ unique sequences of $N-1$ positions for red balls in $j-1$ slots, $N!$ permutations of red balls, and $M!$ permutations of black balls.

Then, we define $p_j$ to be the probability that a sequence of $S_j$ occurs.

$$p_j = \frac{S_j}{(N+M)!} = \frac{j-1 \choose N-1}{N+M \choose N}$$ Assuming for now that $K=1$, we find that the expected number of tries, $E_1$, is

$$E_1 = \sum_{t=N}^{N+M} t p_t = \frac{1}{N+M \choose N}\sum_{t=N}^{N+M}t{t-1\choose N-1} = \frac{N}{N+M \choose N}\sum_{t=N}^{N+M}{t\choose N}=N\sum_{k=0}^{M}\frac{P_{N+t,t}}{P_{N+M,M}}$$

For $K > 1$, $E_K$ is just regrouping the probabilities $p_t$ and then multiplying by the number of tries.

$$E_K = \sum_{t=\lceil N/K \rceil}^{\lceil N+M/K \rceil} t \left(\sum_{v=0}^{K-1}p_{Kt+v}\right)$$