Negative Binomial Distribution with a finite number of trials


Suppose a user tosses a coin $n$ times. How do I compute the expected value of the number of heads before the user sees $k$ tails, ($k < n$)?

This looks somewhat like the Negative Binomial Distribution (NBD). However, unlike this problem, which has a finite number of tosses ($n$), a classic NBD places no cap on the number of tosses. I am wondering how that derivation would look?
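Before deriving anything, it can help to estimate the quantity by simulation. The sketch below is my own scaffolding (the function name, the trial count, and the convention that all heads are counted when fewer than $k$ tails occur are assumptions, since the problem statement does not pin that case down):

```python
import random

def expected_heads_before_k_tails(n, k, trials=100_000, seed=0):
    """Monte Carlo estimate of the heads seen before the k-th tail in n fair flips.

    If fewer than k tails occur, all heads among the n flips are counted
    (one possible reading of the ambiguous problem statement).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        heads = tails = 0
        for _ in range(n):
            if rng.random() < 0.5:
                heads += 1          # head
            else:
                tails += 1          # tail
                if tails == k:      # stop counting at the k-th tail
                    break
        total += heads
    return total / trials
```

For $n$ much larger than $k$ the estimate should sit near the classic NBD mean, since the cap almost never binds.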



Accepted answer:

"before the user sees $k$ tails, ($k < n$)"

Thus the last (i.e., the $k$th) tail can occur anywhere from position $k$ to position $(n-1)$, which means the number of heads before it can range from $0$ to $(n-k-1)$.

Thus, by symmetry, the expected number of heads before seeing $k$ tails is $\frac{n-k-1}{2}$.

Second answer:

I left a number of comments, because I was confused about the wording of the problem. After reading this Negative Binomial Distribution article, I deleted my comments. The smoke had cleared.

Suppose a user tosses a coin $n$ times. How do I compute the expected value of the number of heads before the user sees $k$ tails, $(k<n)$?

As is, the wording of this problem is difficult to interpret.

One interpretation is that you are supposed to let $k$ range from $(1)$ to $(n)$. For each value of $k$, you are to compute the probability of there being $(k-1)$ successes (i.e. Tails) in the first $(n-1)$ trials (i.e. Coin Flips), and a success on the $n$th trial.

The probability of this occurring is

$$\frac{\binom{n-1}{k-1}}{2^n}. \tag1 $$

In such a situation, there will be $(n-k)$ failures (i.e. Heads). Therefore, the expected number of failures is

$$\large{\sum_{k=1}^n} ~~\frac{(n-k) \times \binom{n-1}{k-1}}{2^n} $$

$$ = \dfrac{1}{2^n} ~\times \sum_{k=1}^n ~~\left[(n-k) \times \binom{n-1}{k-1}\right] $$

$$ = \dfrac{1}{2^n} ~\times \sum_{k=1}^{n-1} ~~\left[(n-k) \times \binom{n-1}{k-1}\right] $$

$$~~\left[ ~\text{since, when} ~(k = n), ~\text{the} ~(n - k) ~\text{term is} ~= 0\right]$$

$$ = \dfrac{1}{2^n} ~\times \large{\sum_{k=1}^{n-1}} ~~\left\{ ~(n-k) \times \frac{(n-1)!}{[(k-1)!] ~[(n-k)!]} ~\right\}$$

$$ = \dfrac{1}{2^n} ~\times \large{\sum_{k=1}^{n-1}} ~~\left\{ ~\frac{(n-1)!}{[(k-1)!] ~[(n-k-1)!]} ~\right\}$$

$$ = \dfrac{n-1}{2^n} ~\times \large{\sum_{k=1}^{n-1}} ~~\left\{ ~\frac{(n-2)!}{[(k-1)!] ~[(n-k-1)!]} ~\right\}$$

$$ = \dfrac{n-1}{2^n} ~\times \sum_{k=1}^{n-1} ~\binom{n-2}{k-1} $$

$$ = ~~\left\{\text{by re-indexing} ~k ~\right\} ~~\dfrac{n-1}{2^n} ~\times \sum_{k=0}^{n-2} ~\binom{n-2}{k} $$

$$ = ~~\dfrac{n-1}{2^n} ~\times 2^{(n-2)} $$

$$~~\left[ ~\text{since} ~~(1 + 1)^{(n-2)} = \sum_{k=0}^{n-2} \binom{n-2}{k} ~\right] $$

$$ = \frac{n-1}{4}. \tag2 $$
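The closed form in (2) is easy to sanity-check numerically. Here is a short script using Python's `math.comb`; the loop bounds and tolerance are my own choices:

```python
from math import comb

# Numerical check of the closed form (2):
#   sum_{k=1}^{n} (n - k) * C(n-1, k-1) / 2^n  ==  (n - 1) / 4
for n in range(1, 13):
    total = sum((n - k) * comb(n - 1, k - 1) for k in range(1, n + 1)) / 2**n
    assert abs(total - (n - 1) / 4) < 1e-12, (n, total)
print("closed form (n - 1)/4 confirmed for n = 1..12")
```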


Another possible interpretation is that you are supposed to assume that $n$ is some fixed positive integer, and that $k$ is some fixed element in $\{1,2,\cdots,n\}$, and you are supposed to compute the expected number of failures, in this isolated case.

Here, if $k$ is some fixed element, then from the previous analysis in this answer, the expected number of failures is

$$(n - k) ~\times ~\frac{\binom{n-1}{k-1}}{2^n}. \tag3 $$

(3) above is based on the assumption that the probability of success (i.e. Tails) $~= \frac{1}{2} = ~$ the probability of failure (i.e. Heads).

Personally, I see no way of simplifying the expression in (3) above.

Further, the expression in (3) above may be generalized by assuming that the probabilities of success and failure are $(p)$ and $(q)$, respectively, with $(p + q = 1).$

Then, the expression in (3) above changes to

$$(n - k) ~\times ~\binom{n-1}{k-1} ~p^k ~q^{(n-k)}. \tag4 $$


Finally, there is a 3rd interpretation, which is detailed below. These are the only $(3)$ interpretations of the problem that I can conjure. After reading the "Negative Binomial Distribution" article that I linked to at the start of my answer, I am unsure which of the three interpretations was intended by the problem composer.

Anyway...

The 3rd interpretation is that you are supposed to assume that $k$ is some fixed positive integer, and that $n$ goes from $k$ to infinity. That is, you are supposed to compute the expected number of Heads, until $(k)$ Tails, regardless of how long it takes.

Using the previous analysis, under this interpretation of the problem, you are supposed to assume that $k$ is a fixed positive integer, and you are supposed to compute

$$\large{\sum_{n=k}^\infty} ~\left[ ~(n - k) ~\times ~\frac{\binom{n-1}{k-1}}{2^n} ~\right]. \tag5 $$

An informal approach, whose accuracy (and validity) I am unsure of, is to reason that in $(2k-2)$ coin flips, you expect to get $(k-1)$ Tails and (therefore) $(k-1)$ Heads. The probability of this event would have to be scaled by $(1/2)$, since you must also get a Tails on the very next coin flip.

So, you could (informally, blindly) guess that the expected number of Heads is $\dfrac{k-1}{2}.$

$\color{red}{\text{In fact, the above guess-work is wrong.}}$

A formal approach is shown below.


In (5), with $k$ fixed and $n$ going to infinity, the ratio of successive terms is

$$\frac{n}{2(n-k)} \to \frac{1}{2},$$

so the expression in (5) above is convergent, as expected.

After failing to find some other elegant way of attacking the expression in (5), I decided to use recursion (AKA the uneducated person's substitute for Markov Chains).

Let $E(k)$ denote the expected number of trials (i.e. coin flips) required for $k$ successes (i.e. Tails).

This will allow the expected number of failures to be computed as $E(k) - k.$

Then

$\displaystyle E(1) = \left[ ~\frac{1}{2} \times 1 ~\right] ~+ ~\left[ ~\frac{1}{2} \times ~\left\langle ~1 + E(1) ~\right\rangle ~\right] ~\implies $
$\displaystyle E(1) = 2.$

$\displaystyle E(2) = \left[ ~\frac{1}{2} \times ~\left\langle ~1 + E(1) ~\right\rangle ~\right] ~+ ~\left[ ~\frac{1}{2} \times ~\left\langle ~1 + E(2) ~\right\rangle ~\right] ~\implies $
$\displaystyle E(2) = 4.$

Forming the inductive hypothesis that $E(k) = 2k$, you have that

$\displaystyle E(k+1) = \left[ ~\frac{1}{2} \times ~\left\langle ~1 + E(k) ~\right\rangle ~\right] ~+ ~\left[ ~\frac{1}{2} \times ~\left\langle ~1 + E(k+1) ~\right\rangle ~\right] ~\implies $
$\displaystyle \frac{E(k+1)}{2} = \frac{2k + 1}{2} + \frac{1}{2} = \frac{2k+2}{2} \implies $ $\displaystyle E(k+1) = 2k + 2,~$ which proves the hypothesis.

Therefore, $~E(k) = 2k \implies~$ the expected number of Heads required until the $k$th tail is $E(k) - k = k.$
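The recursion result can be checked against the infinite sum in (5) directly: its partial sums should settle on $k$. A small sketch (the helper name and cutoff are mine):

```python
from math import comb

def nbd_mean_partial(k, n_max):
    """Partial sum of (5): sum_{n=k}^{n_max} (n - k) * C(n-1, k-1) / 2^n."""
    return sum((n - k) * comb(n - 1, k - 1) / 2**n for n in range(k, n_max + 1))

# The partial sums approach k itself, matching the recursion result E(k) - k = k.
for k in (1, 2, 5):
    print(k, nbd_mean_partial(k, 200))
```

Since the ratio of successive terms tends to $1/2$, the tail beyond $n = 200$ is negligible for small $k$.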


It only remains to repeat the above analysis under the assumption that the probabilities of success and failure are $~(p)~$ and $~(q = 1-p),~$ respectively.

$\displaystyle E(1) = \left[ ~p \times 1 ~\right] ~+ ~\left[ ~q \times ~\left\langle ~1 + E(1) ~\right\rangle ~\right] ~\implies $
$\displaystyle p \times E(1) = 1 \implies E(1) = \frac{1}{p}.$

$\displaystyle E(2) = \left[ ~p \times ~\left\langle ~1 + E(1) ~\right\rangle ~\right] ~+ ~\left[ ~q \times ~\left\langle ~1 + E(2) ~\right\rangle ~\right] ~\implies $
$\displaystyle p \times E(2) = p + pE(1) + q = 2 \implies E(2) = \frac{2}{p}.$

Forming the inductive hypothesis that $E(k) = \dfrac{k}{p}$, you have that

$\displaystyle E(k+1) = \left[ ~p \times ~\left\langle ~1 + E(k) ~\right\rangle ~\right] ~+ ~\left[ ~q \times ~\left\langle ~1 + E(k+1) ~\right\rangle ~\right] ~\implies $
$\displaystyle p \times E(k+1) = 1 + \left[ ~p \times \frac{k}{p} ~\right] = (k + 1) \implies E(k+1) = \frac{k+1}{p},$
which proves the hypothesis.

Therefore, $~\displaystyle E(k) = \frac{k}{p} \implies~$ the expected number of failures required until the $k$th success is
$\displaystyle E(k) - k = \frac{k}{p} - k = \frac{k - kp}{p} = \frac{kq}{p}.$
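The general result $kq/p$ can also be confirmed by simulation with a biased coin. As before, the function name, trial count, and seed below are my own choices:

```python
import random

def mean_heads_until_k_tails(p, k, trials=200_000, seed=1):
    """Simulate flips with P(tail) = p until the k-th tail; average the heads seen.

    The negative binomial analysis above predicts a mean of k * (1 - p) / p.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        tails = heads = 0
        while tails < k:
            if rng.random() < p:
                tails += 1          # success (Tail)
            else:
                heads += 1          # failure (Head)
        total += heads
    return total / trials
```

For example, with $p = 1/4$ and $k = 3$ the prediction is $3 \cdot \frac{3/4}{1/4} = 9$, and the simulated mean should land close to that.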