Expected number of successes when sampling N times from a binomial distribution, stopping after M successes?


I'm trying to figure out the equation for the expected number of successes $E(X)$ when repeatedly sampling independent Bernoulli trials with success probability $p$. In this sampling scheme, we either sample $N$ times, or until we observe $M$ successes, whichever comes first.

For example, if $N = 10$ and $M = 3$, the results might look like

| #  | Run 1            | Run 2           | Run 3           |
|----|------------------|-----------------|-----------------|
| 1  | T (1 success)    | F               | F               |
| 2  | F                | F               | T (1 success)   |
| 3  | T (2 successes)  | T (1 success)   | T (2 successes) |
| 4  | F                | F               | F               |
| 5  | F                | F               | T (3 successes) |
| 6  | F                | F               | (stop)          |
| 7  | T (3 successes)  | T (2 successes) |                 |
| 8  | (stop)           | F               |                 |
| 9  |                  | F               |                 |
| 10 |                  | F               |                 |
| **Total successes** | 3 | 2 | 3 |

I know that if the sampling scheme were only to sample $N$ times, then $E(X) = N p$. Similarly, if the sampling scheme were to sample until $M$ successes, naturally $E(X) = M$. However, I'm having difficulty figuring out what $E(X)$ is under the combination of the two stopping criteria.
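For intuition, here is a quick Monte Carlo sketch of the scheme (the function name and defaults are my own, just for illustration). It reproduces the two limiting cases mentioned above:

```python
import random

def simulate(N, M, p, trials=100_000, seed=0):
    """Monte Carlo estimate of E(X): sample up to N Bernoulli(p) trials,
    stopping a run early once M successes have been observed."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        successes = 0
        for _ in range(N):
            if rng.random() < p:
                successes += 1
                if successes == M:
                    break  # M-th success: this run stops early
        total += successes
    return total / trials

# The two limiting cases:
print(simulate(10, 11, 0.3))   # cap never binds, so roughly N*p = 3.0
print(simulate(1000, 3, 0.3))  # N is huge, so roughly M = 3
```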

Answer:


If $M \geq N$, then the experiment will never reach $M$ successes before there are $N$ trials, so the expected value would be $N p$. The case where $M < N$ is less trivial.

Assuming $M < N$, you can split the sum for the expected value $$E(X) = \sum_{k = 0}^M k\, P(\text{there are $k$ successes})$$ into the terms with $k < M$, where the experiment runs through all $N$ trials, and the term $k = M$, where the experiment stops at the $M$-th success (possibly on the $N$-th trial itself).

If $k < M$, the stopping rule never triggers, so the probability of $k$ successes is the same as for an unrestricted run of $N$ trials: the usual binomial value $$\binom{N}{k}p^k(1 - p)^{N - k}.$$ (Indeed, if you imagine continuing to flip even after the $M$-th success, then $X = \min(Y, M)$ where $Y \sim \mathrm{Binomial}(N, p)$, since truncation does not change which trials succeed.)

If the experiment ends with $M$ successes, then the $M$-th success appears on some trial $t$ with $M \leq t \leq N$: the first $t - 1$ trials contain exactly $M - 1$ successes and trial $t$ is a success. Summing these (negative binomial) probabilities over the position of the $M$-th success, we get $$P(X = M) = \sum_{t = M}^{N} \binom{t - 1}{M - 1} p^M (1 - p)^{t - M}.$$ Equivalently, this is the probability that an unrestricted run of $N$ trials contains at least $M$ successes, $\sum_{k = M}^{N} \binom{N}{k} p^k (1 - p)^{N - k}$.
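As a numeric sanity check (a quick sketch, not part of the original post; function names are mine), the probability of ending with $M$ successes can be computed both ways, by the position of the $M$-th success and as a binomial tail, and the two agree:

```python
from math import comb

def p_reach_M(N, M, p):
    """P(X = M): sum over the trial t on which the M-th success lands
    (first t-1 trials hold exactly M-1 successes, and trial t succeeds)."""
    return sum(comb(t - 1, M - 1) * p**M * (1 - p)**(t - M)
               for t in range(M, N + 1))

def p_binom_tail(N, M, p):
    """Same event seen through an unrestricted run of N trials:
    at least M successes among N."""
    return sum(comb(N, k) * p**k * (1 - p)**(N - k)
               for k in range(M, N + 1))

print(p_reach_M(10, 3, 0.3), p_binom_tail(10, 3, 0.3))  # the two agree
```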

We can put these two observations together to express the expected value as $$E(X) = \sum_{k = 0}^{M - 1} k \binom{N}{k}p^k(1 - p)^{N - k} + M \sum_{t = M}^{N} \binom{t - 1}{M - 1} p^M (1 - p)^{t - M}.$$ The first sum can be rewritten as $$N p \sum_{k = 0}^{M - 2} \binom{N - 1}{k} p^k (1 - p)^{N - 1 - k}$$ using the same identity $k\binom{N}{k} = N\binom{N - 1}{k - 1}$ one uses to prove that the expected value of the binomial distribution is $N p$; but other than that, there probably isn't a nicer way to express these partial sums of a binomial expansion.
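The formula is short enough to put into code. The sketch below (names are my own) checks it against a brute-force computation that uses the coupling $X = \min(Y, M)$ with $Y \sim \mathrm{Binomial}(N, p)$, obtained by imagining the flips continue past the $M$-th success:

```python
from math import comb

def expected_successes(N, M, p):
    """E(X) for: sample up to N Bernoulli(p) trials, stop after M successes."""
    if M >= N:
        return N * p  # the cap never cuts the run short of its binomial mean
    below_cap = sum(k * comb(N, k) * p**k * (1 - p)**(N - k)
                    for k in range(M))
    at_cap = M * sum(comb(t - 1, M - 1) * p**M * (1 - p)**(t - M)
                     for t in range(M, N + 1))
    return below_cap + at_cap

def expected_min(N, M, p):
    """Brute-force check: E[min(Y, M)] with Y ~ Binomial(N, p)."""
    return sum(min(k, M) * comb(N, k) * p**k * (1 - p)**(N - k)
               for k in range(N + 1))

print(expected_successes(10, 3, 0.3))  # matches expected_min(10, 3, 0.3)
```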