Normal approximation of the Binomial distribution


So my textbook says that if $X \sim B(n,p)$ with $np>5$ and $nq>5$ (where $q=1-p$), then $X$ can be approximated by a normal distribution $X\sim N(\mu,\sigma^2)$ with $\mu = E(X) = np$ and $\sigma^2 = Var(X) = npq$.

So I understand that if $n$ is very large then $X$ will be roughly normally distributed, with $E(X) = np$ and $Var(X) = npq$, but why must $np > 5$ (i.e. $E(X) > 5$) and $nq > 5$? And furthermore, what does $nq$ represent?
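The textbook's approximation can be checked numerically. The following is a quick stdlib-only Python sketch (illustrative, not from the post) that compares the exact Binomial CDF to the $N(np, npq)$ CDF for a case where both $np > 5$ and $nq > 5$ hold:

```python
import math

def binom_cdf(k, n, p):
    """Exact Binomial(n, p) CDF at k, via math.comb."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mu, sigma):
    """Normal(mu, sigma^2) CDF via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

n, p = 50, 0.4          # np = 20 > 5 and nq = 30 > 5, so the rule of thumb applies
q = 1 - p
mu, sigma = n * p, math.sqrt(n * p * q)

# Maximum absolute difference between the two CDFs (with continuity correction)
err = max(abs(binom_cdf(k, n, p) - normal_cdf(k + 0.5, mu, sigma))
          for k in range(n + 1))
print(f"max CDF error: {err:.4f}")   # small when np and nq are both well above 5
```

Repeating this with, say, $n = 50$ and $p = 0.02$ (so $np = 1$) shows the error growing sharply, which is what the rule of thumb is guarding against.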


There are two answers below.


The CLT says the normal approximation is good for a fixed distribution when $n$ is large enough. But when you have another parameter to play with, tweaking that other parameter can slow down the convergence rate (meaning that $n$ must get larger to achieve a given error tolerance). In the case of the binomial distribution, there is a sort of complete classification:

  • When $n,np$ and $nq$ are all large, Bin($n,p$) behaves like N($np,npq$).
  • When $n$ is large but $np$ is not large, Bin($n,p$) behaves like Poisson($np$) and the normal approximation has a large error.
  • When $n$ is large but $nq$ is not large, Bin($n,p$) behaves like $n - {}$Poisson($nq$) and again the normal approximation has a large error. (This is really the same statement as the previous one, because if $X \sim$ Bin($n,p$) then $n-X \sim$ Bin($n,q$).)
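The Poisson regime in the second bullet can be seen numerically. Here is a stdlib-only Python sketch (illustrative, not part of the original answer) taking $n$ large but $np$ small, and comparing the binomial CDF to both the Poisson and the normal approximations:

```python
import math

def binom_pmf(k, n, p):
    """Binomial(n, p) probability mass at k."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson(lam) probability mass at k."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# n large but np small: the Poisson regime
n, p = 1000, 0.003          # np = 3, below the rule-of-thumb threshold
lam = n * p
sigma = math.sqrt(n * p * (1 - p))

# Accumulate both CDFs and track the worst-case error of each approximation
binom_cdf = poisson_cdf = 0.0
max_err_poisson = max_err_normal = 0.0
for k in range(30):
    binom_cdf += binom_pmf(k, n, p)
    poisson_cdf += poisson_pmf(k, lam)
    max_err_poisson = max(max_err_poisson, abs(binom_cdf - poisson_cdf))
    max_err_normal = max(max_err_normal,
                         abs(binom_cdf - normal_cdf(k + 0.5, lam, sigma)))

print(f"Poisson error: {max_err_poisson:.4f}, normal error: {max_err_normal:.4f}")
```

The Poisson($np$) CDF tracks the binomial far more closely than the normal CDF does, even with a continuity correction.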

One way to anticipate this in advance is to use a quantitative refinement of the CLT such as the Berry-Esseen theorem. For the binomial distribution, the Berry-Esseen theorem bounds the difference of the CDFs by $C \frac{1}{\sqrt{n}} \frac{pq^3+qp^3}{(pq)^{3/2}}$, where $0.4<C<0.5$ is a constant. The important thing is the ratio involving $p$ and $q$, which behaves as $p^{-1/2}$ as $p \to 0$ and as $q^{-1/2}$ as $q \to 0$. Thus, roughly speaking, the Berry-Esseen theorem bounds the error by $\frac{C'}{\sqrt{n \min \{ p,q \}}}$, where $C'$ is a new constant. If you plot the actual error, you see this kind of scaling, although the $C'$ given by the theorem is significantly larger than the optimal one.
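You can reproduce the plot's message with a stdlib-only Python sketch (illustrative; the constants and parameter choices are mine, not from the answer). Fixing $n$ and shrinking $p$, the sup-norm error between the binomial and normal CDFs grows, while $\text{error} \cdot \sqrt{np}$ stays roughly constant, as the $\frac{C'}{\sqrt{n \min\{p,q\}}}$ bound predicts:

```python
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def max_cdf_error(n, p):
    """Sup-norm distance between the Binomial(n, p) CDF and the N(np, npq) CDF."""
    q = 1 - p
    mu, sigma = n * p, math.sqrt(n * p * q)
    cdf, worst = 0.0, 0.0
    for k in range(n + 1):
        cdf += math.comb(n, k) * p**k * q**(n - k)
        worst = max(worst, abs(cdf - normal_cdf(k, mu, sigma)))
    return worst

# As p shrinks with n fixed, the error grows roughly like 1/sqrt(n*p),
# matching the Berry-Esseen ratio (p^2 + q^2)/sqrt(pq) ~ p^(-1/2) for small p.
n = 400
for p in (0.2, 0.05, 0.0125):
    err = max_cdf_error(n, p)
    print(f"p = {p:<7} error = {err:.4f}  error*sqrt(n*p) = {err * math.sqrt(n * p):.3f}")
```

The middle column grows as $p$ falls; the right column stays on the same order, which is the $p^{-1/2}$ scaling described above.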

Intuitively what the Berry-Esseen theorem is capturing is that the normal approximation to a distribution is symmetric about its mean, whereas the original distribution in general is not. Thus if a distribution (with the standard deviation scaled out) is highly skewed, then $n$ must become quite large in order to mitigate the effect of this skew.
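The skew described above can be made concrete: the Binomial($n,p$) distribution has skewness $(q-p)/\sqrt{npq}$. A short stdlib-only Python sketch (illustrative, not from the answer) shows that with $p$ fixed the skewness decays like $1/\sqrt{n}$, but if $p$ shrinks as $n$ grows, keeping $np$ small, it does not decay at all:

```python
import math

def binomial_skewness(n, p):
    """Skewness of Binomial(n, p): (q - p) / sqrt(n p q) with q = 1 - p."""
    q = 1 - p
    return (q - p) / math.sqrt(n * p * q)

# With p fixed, the skewness shrinks like 1/sqrt(n) ...
for n in (10, 100, 1000):
    print(n, round(binomial_skewness(n, 0.1), 3))

# ... but shrinking p alongside growing n (np held at 1) keeps it large,
# which is why a large n alone is not enough for the normal approximation.
for n, p in ((10, 0.1), (100, 0.01), (1000, 0.001)):
    print(n, p, round(binomial_skewness(n, p), 3))
```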


If $q=1-p$, then $nq=n(1-p)$ is the expected number of failures in the $n$ trials. Since the variance is $npq = p \cdot (nq)$, $nq$ can also be read as a scaled version of the variance of the binomial distribution.