Approximating a binomial random variable by a normal random variable


During a lecture, my probability professor told me that if Stirling's formula is applied to the PMF of $X\sim\text{Bin}(n,\frac12)$,

$$p(k)={n \choose k} \left( \frac12 \right)^n,$$

so that

$$p(k)\approx\frac{\sqrt{2\pi n}\left(\frac{n}{e}\right)^n}{\sqrt{2\pi k}\left(\frac{k}{e}\right)^k \sqrt{2\pi (n-k)}\left(\frac{n-k}{e}\right)^{n-k} 2^n},$$

the result approximates a normal density:

$$p\left(\frac{n}{2}+t\right)\approx\sqrt{\frac{2}{\pi n}}\exp\left( -\frac{2t^2}{n} \right)$$

when $n\gg 1$ and $t\ll n$. I am not sure what the intermediate steps are to get that approximation. How is it done?


There are 2 best solutions below


Let's start with the Stirling approximation:

$n! \approx \sqrt{2 \pi n}\left({n \over e}\right)^n$

Now let's write the binomial PMF as:

$p(k) = {n! \over k! (n-k)!} \left({1 \over 2}\right)^n$

Substituting Stirling's formula and $k = n/2+t$ you get (after some algebra):

$p(k) \approx {1 \over \sqrt{2 \pi}} \sqrt{{n \over (n/2+t)(n/2-t)}} \left( {n \over n/2-t} \right)^{n/2-t} \left( {n \over n/2+t} \right)^{n/2+t} \left({1 \over 2}\right)^n$
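The "some algebra" can be sketched as follows, writing $k = n/2+t$ and $n-k = n/2-t$. The powers of $e$ cancel, since $e^{-n}/(e^{-k}e^{-(n-k)}) = 1$, and $n^n = n^k\, n^{n-k}$ can be split between the two remaining factors:

$$\frac{\sqrt{2\pi n}\left(\frac{n}{e}\right)^n}{\sqrt{2\pi k}\left(\frac{k}{e}\right)^k \sqrt{2\pi (n-k)}\left(\frac{n-k}{e}\right)^{n-k}} = \frac{1}{\sqrt{2\pi}}\sqrt{\frac{n}{k(n-k)}}\left(\frac{n}{k}\right)^{k}\left(\frac{n}{n-k}\right)^{n-k}$$

Multiplying by $\left({1 \over 2}\right)^n$ and substituting $k = n/2+t$ gives the expression above.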

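Finally, you can use the approximation

$\left( {n \over n+a} \right)^n \approx e^{-a}$ for large $n$ in the third and fourth terms above, keeping the next order in the exponent as well, because the first-order factors $e^{-t}$ and $e^{t}$ from the two terms cancel each other.

For example, since $-(n+2t)\log\left(1+{2t \over n}\right) \approx -2t - {2t^2 \over n}$:

$\left( {n \over n/2+t}\right)^{n/2+t} = 2^{n/2+t} \left[ \left({n \over n+2t}\right)^{n+2t} \right]^{1/2} \approx 2^{n/2+t} \left(e^{-2t-2t^2/n}\right)^{1/2} = 2^{n/2+t} e^{-t-t^2/n}$

The other term gives $2^{n/2-t} e^{t-t^2/n}$ in the same way, so their product is $2^n e^{-2t^2/n}$. With $\sqrt{{n \over (n/2+t)(n/2-t)}} \approx {2 \over \sqrt{n}}$ for $t \ll n$, this yields $p(n/2+t) \approx \sqrt{{2 \over \pi n}}\, e^{-2t^2/n}$.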
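As a quick numerical sanity check, here is a minimal Python sketch that compares the exact $\text{Bin}(n,\frac12)$ PMF with the normal density of the same mean $n/2$ and variance $n/4$, that is $\sqrt{2/(\pi n)}\,e^{-2t^2/n}$. The function names and the choice $n = 1000$ are illustrative assumptions, not part of the original derivation.

```python
import math

def binom_pmf_half(n, k):
    """Exact PMF of Bin(n, 1/2) at k (exact integer arithmetic, then one division)."""
    return math.comb(n, k) / 2**n

def normal_approx(n, t):
    """Normal density with mean n/2 and variance n/4, evaluated at n/2 + t."""
    return math.sqrt(2 / (math.pi * n)) * math.exp(-2 * t * t / n)

n = 1000  # illustrative choice; any large even n works
for t in (0, 5, 10, 20):
    exact = binom_pmf_half(n, n // 2 + t)
    approx = normal_approx(n, t)
    # The ratio should be close to 1 when n is large and t is small compared to n.
    print(t, exact, approx, approx / exact)
```

The ratio in the last column stays close to 1 for these values and drifts away as $t$ grows relative to $n$.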


This is a personal opinion: I think that the simplest form of Stirling's approximation is $$\log (\Gamma (x))=x (\log (x)-1)+\frac{1}{2} (\log (2 \pi )-\log (x))+\frac{1}{12 x}+O\left(\frac{1}{x^3}\right)$$ So, writing

$$p\left(\frac{n}{2}+t\right)=\frac{2^{-n} \,\,\Gamma (n+1)}{\Gamma \left(\frac{n}{2}-t+1\right)\,\, \Gamma \left(\frac{n}{2}+t+1\right)}$$

take logarithms first, use this form of Stirling's approximation three times, and finish with a Taylor expansion to get (I give several terms so that you can push the approximation to higher order) $$\log\Bigg[p\left(\frac{n}{2}+t\right) \Bigg]=-\frac{1}{2} (\log (n)+\log (2 \pi )-2 \log (2))-\frac{8 t^2+1}{4 n}+\frac{2 t^2}{n^2}+O\left(\frac{1}{n^3}\right)$$

Exponentiating, using the fact that $$x=e^{\log(x)}$$ gives $$p\left(\frac{n}{2}+t\right) =\sqrt{\frac{2}{\pi n} }\,\exp\Bigg[-\frac{8 t^2+1}{4 n}+\frac{2 t^2}{n^2}+O\left(\frac{1}{n^3}\right) \Bigg]$$

I suppose that you now see what your professor told you (simplifying a bit more, only $\sqrt{\frac{2}{\pi n}}\,e^{-2t^2/n}$ remains) and how you could improve the approximation, although I do not think that is necessary.
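The expansion of $\log p\left(\frac{n}{2}+t\right)$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, $n$ is taken even with integer $t$, and the "exact" value is computed from the Gamma-function form of the PMF with Python's `math.lgamma`.

```python
import math

def log_pmf_exact(n, t):
    """log p(n/2 + t) for Bin(n, 1/2), from the Gamma-function form of the PMF."""
    return (math.lgamma(n + 1)
            - math.lgamma(n / 2 - t + 1)
            - math.lgamma(n / 2 + t + 1)
            - n * math.log(2))

def log_pmf_series(n, t):
    """Truncated series: log sqrt(2/(pi n)) - (8 t^2 + 1)/(4 n) + 2 t^2 / n^2."""
    return (0.5 * math.log(2 / (math.pi * n))
            - (8 * t * t + 1) / (4 * n)
            + 2 * t * t / n**2)

n = 1000  # illustrative choice
for t in (0, 5, 10):
    exact = log_pmf_exact(n, t)
    series = log_pmf_series(n, t)
    # The difference reflects the neglected higher-order terms.
    print(t, exact, series, exact - series)
```

For fixed $t$ the difference shrinks quickly as $n$ grows, which is consistent with the stated $O\left(\frac{1}{n^3}\right)$ remainder.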