Let $X$ be a random variable with the negative binomial distribution with parameters $r$ and $p$, $$P(X = n)~=~{n-1\choose r-1} p^r(1-p)^{n-r} .$$ The textbook says the mean is ${r(1-p)\over p}$, which confuses me, because I always think of a negative binomial random variable $X$ as the sum of $r$ independent geometrically distributed random variables $\tau$, $$P(\tau = n)~=~p(1-p)^{n-1}.$$ Then by linearity of expectation, the mean of $X$ should be ${r\over p}$, since the mean of $\tau$ is ${1\over p}$. Can anyone help me figure it out?
Confused about the mean of the negative binomial distribution
353 Views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 best solutions below.
The computation of the mean of a negative binomial distribution is usually done by means of a 'differentiation trick'.
First, let's look at a geometric random variable $X$ that counts the number of trials up to and including the first success in a sequence of Bernoulli trials: $P(X = k) = q^{k-1}p$ for $k = 1, 2, \dots,$ where $0 < p = P(\text{Success}) \le 1$ and $q = 1-p.$
To find the mean:
$$E(X) = \sum_{k=1}^\infty kq^{k-1}p = p\sum_{k=1}^\infty kq^{k-1}.$$
The series is summed as follows:
$$\frac{d}{dq}\left(\sum_{k=1}^\infty q^{k}\right) = \sum_k kq^{k-1},$$
where differentiation within the series is justifiable. Then
$$E(X) = p\sum_k kq^{k-1} = p\frac{d}{dq}\left(\sum_k q^k\right) = p\frac{d}{dq}\left(\frac{q}{1-q}\right),$$ upon summing the geometric series. Finally,
$$E(X) = p\left(\frac{1}{(1-q)^2}\right) = \frac 1p.$$
Then a negative binomial random variable $Y,$ the waiting time until the $r$th success, is the sum of $r$ independent geometric random variables, each a waiting time until one success. Hence $E(Y) = rE(X) = \frac rp.$
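This can also be checked by simulation (my addition, in the same spirit as the R computations below; the values of $r = 5$ and $p = 1/3$ are arbitrary choices for illustration):

```r
# Sanity check: the sum of r independent geometric waiting times
# should have mean r/p.
set.seed(2019)
r <- 5; p <- 1/3
# rgeom counts failures before the first success, so add 1 to count trials.
y <- replicate(10^5, sum(rgeom(r, p) + 1))
mean(y)   # should be close to r/p = 15
```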
In computational situations, it may be possible to sum many terms of the series for $E(X)$ to get a useful approximation. For example, if $X$ is a geometric random variable with $p = 1/3,$ then $E(X) = 3.$ Consider the following computation in R, in which 100 terms of the series are summed:
p = 1/3; q = 1-p; k = 1:100
pdf = q^(k-1)*p
sum(k*pdf)
[1] 3
The implementation of the geometric distribution in R counts the number of failures before the first success, taking values $0, 1, 2, \dots.$ Thus the sample mean of a million randomly generated independent geometric observations with $p = 1/3$ (shifted by 1 to count trials) estimates $E(X) = 3$ to within a 95% margin of error of about $\pm 0.005.$
set.seed(2019)
x = rgeom(10^6, 1/3)
mean(x + 1)
[1] 3.000108
2*sd(x)/sqrt(10^6)
[1] 0.004911733
Note: If you know about moment generating functions, you can find that the MGF of $Y$ is $m_Y(t) = \left[\frac{pe^t}{1-qe^t}\right]^r.$ Then to get $E(Y)$ take the first derivative of $m_Y(t)$ (with respect to $t$) and evaluate it at $t=0.$
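Carrying out that differentiation (a step not shown above) is easiest on the log scale:
$$\ln m_Y(t) = r\ln p + rt - r\ln\left(1-qe^t\right), \qquad \frac{m_Y'(t)}{m_Y(t)} = r + \frac{rqe^t}{1-qe^t}.$$
Since $m_Y(0) = 1,$
$$E(Y) = m_Y'(0) = r + \frac{rq}{1-q} = r + \frac{rq}{p} = \frac{r(p+q)}{p} = \frac{r}{p}.$$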
Ref: Except for notation, the derivation above of $E(X)$ via differentiation is similar to that in Wackerly, Mendenhall & Scheaffer, Mathematical Statistics with Applications, 6th ed., Duxbury, 2002, p. 112.
There are two different conventions for the geometric distribution. One counts the number of trials, has support $\{ 1,2,\dots \}$, and has mean $1/p$. The other counts the number of failures, has support $\{ 0,1,\dots \}$, and has mean $(1-p)/p$.
Summing iid copies of these gives two different kinds of negative binomial distributions. There is also a variant of the negative binomial distribution which counts successes rather than failures, so that $p$ and $1-p$ get switched around. You simply have to be aware of the convention being used in a given context.