Applying CLT to Poisson Distribution


If $X$ is a Poisson random variable with parameter $n$, how large does $n$ need to be so that

$\mathbb{P}\left(\left\lvert\frac{X}{n}-1 \right\rvert> 0.01\right) < 0.1$?

Attempt: Noting that $X$ is a sum of $n$ independent, identically distributed Poisson random variables with parameter 1, we find that $X/n$ is the sample mean, so I get $\mathbb{P}\left(\frac{\frac{X}{n} - 1}{\sqrt n}>0.01\right)$, but there is an extra $\sqrt n$ in the denominator. How should I proceed?


Let $X \sim \text{Poisson}(n)$. You are correct that $$ X = \sum_{i=1}^n X_i $$ where the $X_i \sim \text{Poisson}(1)$ are independent. Thus $X$ is indeed a sum of i.i.d. random variables with finite mean and variance, so we may apply the CLT.

First note $E(X) = \text{Var}(X) = n$. Then $$ P\left(\left| \frac{X}{n} - 1\right| > 0.01\right) = P\left(\left| \frac{\frac{X}{n} - 1}{\sqrt{n}}\right| > \frac{0.01}{\sqrt{n}}\right) = P\left(\left| \frac{X - n}{\sqrt{n}}\right| > 0.01\sqrt{n}\right). $$ The first equality comes from dividing both sides by $\sqrt{n}$ and the second from multiplying both sides by $n$. The reason to do this is that $\frac{X - n}{\sqrt{n}}$ is now standardized (mean 0, variance 1), so we can invoke the CLT: \begin{align*} P\left(\left| \frac{X - n}{\sqrt{n}}\right| > 0.01\sqrt{n}\right) & \approx P(|Z| > 0.01\sqrt{n}) \\ & = P(Z < -0.01\sqrt{n}) + P(Z > 0.01\sqrt{n}) \\ & = 2(1 - \Phi(0.01\sqrt{n})). \end{align*} Here $Z$ is a standard normal random variable and $\Phi$ is its cdf. The first (approximate) equality is from the CLT, the second breaks the event in question into two disjoint events, and the third follows from the symmetry of the normal distribution.
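To get a feel for how the approximation $2(1 - \Phi(0.01\sqrt{n}))$ behaves as $n$ grows, here is a minimal Python sketch. The function name `approx_tail` and the parameter `eps` are my own labels, not part of the question, and the code evaluates only the normal approximation, not the exact Poisson probability.

```python
from math import sqrt
from statistics import NormalDist


def approx_tail(n, eps=0.01):
    """CLT approximation 2 * (1 - Phi(eps * sqrt(n))) to P(|X/n - 1| > eps).

    Assumes X ~ Poisson(n); `approx_tail` and `eps` are illustrative names.
    """
    z = eps * sqrt(n)  # threshold on the standardized scale
    return 2.0 * (1.0 - NormalDist().cdf(z))


# The approximate probability shrinks as n grows.
for n in (1_000, 10_000, 30_000):
    print(n, approx_tail(n))
```

For $n = 10{,}000$ the threshold is $0.01\sqrt{n} = 1$, so the value is the familiar two-sided one-sigma tail of about 0.317, still well above 0.1.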

Now simply set \begin{align*} & 2(1 - \Phi(0.01\sqrt{n})) < 0.1 \\ & \iff 1 - \Phi(0.01\sqrt{n}) < \frac{0.1}{2} \\ & \iff 1 - \frac{0.1}{2} < \Phi(0.01\sqrt{n}) \\ & \iff \Phi^{-1}\left(1 - \frac{0.1}{2}\right) < 0.01\sqrt{n} \\ & \iff \left(\frac{1}{0.01}\Phi^{-1}\left(1 - \frac{0.1}{2}\right)\right)^2 < n. \end{align*} Lines 1-3 are simple algebra, the fourth line holds because $\Phi$ (and hence $\Phi^{-1}$) is strictly increasing, and the fifth holds because the square function is strictly increasing on the positive numbers (this just justifies that the inequalities are preserved).

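You could use, e.g., MATLAB's norminv function to evaluate this. Since $\Phi^{-1}(0.95) \approx 1.645$, I got $$ n > 27{,}055.4, $$ so the smallest integer that works (under the normal approximation) is $n = 27{,}056$.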
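If MATLAB is not at hand, the same bound can be evaluated with Python's standard library. This is only a sketch: `min_n`, `eps`, and `alpha` are names I made up for illustration, and `NormalDist().inv_cdf` stands in for $\Phi^{-1}$.

```python
from math import floor
from statistics import NormalDist


def min_n(eps=0.01, alpha=0.1):
    """Return (bound, n_min), where n must strictly exceed bound.

    bound = (Phi^{-1}(1 - alpha/2) / eps)^2, from the CLT-based derivation.
    `min_n`, `eps`, and `alpha` are illustrative names only.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # standard normal quantile
    bound = (z / eps) ** 2
    n_min = floor(bound) + 1  # smallest integer strictly greater than bound
    return bound, n_min


bound, n_min = min_n()
print(bound, n_min)
```

The bound comes out at roughly 27,055.4, so the smallest integer $n$ satisfying the approximate inequality is 27,056. Keep in mind this rests on the normal approximation rather than the exact Poisson tail.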