Approximate the probability $P(3000\leq H\leq 4000)$ within a factor of $10$

50 Views Asked by At

I'm working on a probability exercise and it asks to approximate within a factor of $10$ the probability of getting $3000$ or more heads out of $4000$ fair coin tosses. I was also hinted that

The following common logarithms are accurate to roughly one part in $4000$: $\log 2 = 0.301$, $\log 3 = 0.477$.

So the exact answer to this is $$ P(H\geq 3000) = \sum_{n=3000}^{4000}\binom{4000}{n}\left(\frac{1}{2}\right)^{4000} $$ which is the tail of a Binomial distribution. I was thinking about Normal Approximation of Binomial distribution, but I couldn't show by hand the error estimation of such method. Can you give me some hint how to approach this? Here's my attempt so far:

We can approximate $P(H=n)$ using normal distribution with mean $\mu=\frac{1}{2}\cdot 4000 = 2000$ and standard deviation $\sigma = \sqrt{\frac{1}{2}\cdot\frac{1}{2}\cdot 4000} = 10\sqrt{10}$: $$ P(H=n)\approx \frac{1}{\sqrt{2\pi}(10\sqrt{10})}\exp\left( -\frac{(n-2000)^2}{2\cdot 1000} \right) =\frac{e^{-x^2/2}}{20\sqrt{5\pi}}. $$ where $x=\frac{n-2000}{10\sqrt{10}}$. Consequently, $$ P(H\geq n) \approx \frac{1}{20\sqrt{5\pi}}\int_{x}^\infty e^{-t^2/2}\,dt. $$ By using integration by part repeatedly, we obtain: $$ \begin{align*} \int_x^\infty e^{-t^2/2}\,dt &= e^{-x^2/2}\left(\frac{1}{x}-\frac{1}{x^3}+\frac{3}{x^5}-\frac{3\cdot 5}{x^7}+\frac{3\cdot 5\cdot 7}{x^9}-\ldots\right)\\ &=\frac{e^{-x^2/2}}{x}\left( 1-\sum_{k=0}^\infty\frac{(-1)^{k}(2k+1)!!}{x^{2k+2}} \right). \end{align*} $$ From this, we can approximate: $$ P(H\geq 3000)\approx \frac{e^{-500}}{10\sqrt{10}}\left(1-\frac{1}{10^3}+\frac{3}{10^5}-\frac{3\cdot 5}{10^7}+\ldots\right). $$ The problem is this is still too difficult to compute by hand, and even numerical approximation using 100 terms in the series has an error of factor $\approx 4.3873\times 10^{-9}$.

1

There are 1 best solutions below

2
On

$$P(H\geq n) \approx \frac{1}{20\sqrt{5\pi}}\int_{x}^\infty e^{-t^2/2}\,dt=\frac{1}{20\sqrt{5\pi}}\sqrt{\frac{\pi }{2}} \text{erfc}\left(\frac{x}{\sqrt{2}}\right)=\frac{1}{20 \sqrt{10}}\text{erfc}\left(\frac{x}{\sqrt{2}}\right)$$ $$x=\frac{n-2000}{10\sqrt{10}}\implies P(H\geq n) \approx \frac{1}{20 \sqrt{10}}\left(1-\text{erf}\left(\frac{n-2000}{20 \sqrt{5}}\right)\right)$$ $$n=3000\implies P(H\geq 3000) \approx \frac{1}{20 \sqrt{10}}\left(1-\text{erf}\left(10 \sqrt{5}\right)\right)=2.84 \times 10^{-220}$$ while

$$\sum_{n=3000}^{4000}\binom{4000}{n}\left(\frac{1}{2}\right)^{4000}=1.25 \times 10^{-220}$$

If you want to compute $$1-\text{erf}(x)$$ when $x$ is large, use $$1-\text{erf}(x)=\frac{e^{-x^2}}{\sqrt{\pi } x} \frac{2 \left(2 x^4+9 x^2+4\right) }{4 x^4+20 x^2+15 }$$ For $x=10 \sqrt{5}$ it will give $$\frac{504504}{5050075 e^{500} \sqrt{5 \pi }}=1.796 \times 10^{-219}$$ while the exact value is ... the same.

Edit

If you want a simpler approximation, just use the fact that $$\frac{e^{-x^2}}{\sqrt{\pi } x} \left(1-\frac{1}{2 x^2}\right) <1-\text{erf}(x) <\frac{e^{-x^2}}{\sqrt{\pi } x}$$ If you