Why does $p$ have to be moderate in the Poisson approximation to binomial random variable?


So the proof that a binomial random variable with large $n$ is approximated by a Poisson random variable with $\lambda = np$ (given below) doesn't seem to use the fact that $p$ is moderate/small, so why do Wikipedia and my textbook (Ross) state this as a condition?

Proof: If $\lambda = np$, then $ P(X = i) = {n \choose i} p^i (1 - p)^{n-i} $
$$ = \frac{n(n-1)\cdots(n-i+1)}{i!} \left(\frac{\lambda}{n}\right)^i \frac{\left(1 - \frac{\lambda}{n}\right)^n}{\left(1 - \frac{\lambda}{n}\right)^i} $$

and since $ \lim_{n\to \infty} \frac{n(n-1)\cdots(n - i + 1)}{n^i} = 1$,

$ \lim_{n\to \infty} \left(1 - \frac{\lambda}{n}\right)^i = 1$, and $ \lim_{n\to \infty} \left(1 - \frac{\lambda}{n}\right)^n = e^{-\lambda}$,

$$ \lim_{n\to \infty} P(X = i) = \frac{\lambda^i}{i!}e^{-\lambda}$$


Accepted answer:

The essential problem is that if $\lambda = np$, then $\lambda$ is a function of $n$, and if you take the limit as $n \to \infty$, then $\lambda$ will increase without bound unless $p$ is small. Therefore, any time you write $n \to \infty$ while treating $\lambda$ as a constant independent of $n$ in evaluating the limit, you are implicitly stating that $p \to 0$ as $n \to \infty$.

$$\begin{align*} \Pr[X = i] &= \binom{n}{i} p^i (1-p)^{n-i} \\ &= \prod_{k=1}^{i} \frac{n+1-k}{k} \left(\frac{\lambda}{n}\right)^i \frac{(1-\lambda/n)^n}{(1-\lambda/n)^i} \\ &= \frac{\lambda^i}{i!} \prod_{k=1}^i \frac{n+1-k}{n} \cdot \frac{(1-\lambda/n)^n}{(1-\lambda/n)^i}. \end{align*}$$ Now if we take the limit, we get $$\begin{align*} \lim_{n \to \infty} \Pr[X = i] &= \frac{\lambda^i}{i!} \lim_{n \to \infty} \prod_{k=1}^i \left( 1 - \frac{k-1}{n} \right) \cdot \frac{\lim_{n \to \infty} \left( 1 - \frac{\lambda}{n} \right)^n}{\lim_{n \to \infty} \left( 1 - \frac{\lambda}{n} \right)^i} \\ &= \frac{\lambda^i}{i!} \cdot (1) \cdot \frac{e^{-\lambda}}{1} \\ &= e^{-\lambda} \frac{\lambda^i}{i!}. \end{align*}$$ But again, this limit treats $\lambda$ as a constant with respect to $n$.
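To see this convergence numerically, here is a quick Python sketch; the helper functions and the choice of $\lambda = 3$, $i = 2$ are ours, purely for illustration:

```python
from math import comb, exp, factorial

def binom_pmf(n, p, i):
    """Exact binomial probability P(X = i)."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def poisson_pmf(lam, i):
    """Poisson probability e^{-lam} lam^i / i!."""
    return exp(-lam) * lam**i / factorial(i)

lam, i = 3.0, 2
for n in (10, 100, 10_000):
    p = lam / n  # p shrinks as n grows, keeping lambda = np fixed
    print(n, binom_pmf(n, p, i), poisson_pmf(lam, i))
```

The gap between the two columns shrinks as $n$ grows, precisely because $p = \lambda/n \to 0$.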


An even easier way to see this is to note that $$(1 - p)^n = (1 - \lambda/n)^n$$ if $\lambda = np$. But if you take the limit of each side as $n \to \infty$, the LHS tends to $0$ for any fixed $p \in (0,1)$, whereas the RHS tends to $e^{-\lambda}$ if you treat $\lambda$ as a constant with respect to $n$.
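A quick numerical check of this in Python (the particular values $p = 0.3$ and $\lambda = 3$ below are arbitrary choices of ours):

```python
from math import exp

p_fixed, lam = 0.3, 3.0
for n in (10, 100, 1000):
    lhs = (1 - p_fixed) ** n   # p held fixed: collapses toward 0
    rhs = (1 - lam / n) ** n   # lambda held fixed: approaches e^{-lambda}
    print(n, lhs, rhs)
print("e^{-lambda} =", exp(-lam))
```

With $p$ fixed the left column underflows toward $0$, while the right column settles near $e^{-3} \approx 0.0498$.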

Answer:

Suppose, for example, that $p = 1/2$ and $n$ is, say, $100$. Let $i = 50$. Then $\frac{n(n-1)(n-2)\cdots(n-i+1)}{n^i}$ is quite far from $1$ (it is of order $10^{-7}$), so the argument quoted no longer works.
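This is easy to confirm numerically; a minimal Python sketch:

```python
# Ratio n(n-1)...(n-i+1) / n^i for n = 100, i = 50:
# nowhere near the limit of 1 that the proof relies on.
n, i = 100, 50
ratio = 1.0
for k in range(i):
    ratio *= (n - k) / n
print(ratio)
```

The product comes out around $3 \times 10^{-7}$, so treating this factor as $1$ is hopeless when $i$ is comparable to $n$.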

Answer:

Suppose you want to approximate $P_i = B_i(n,p)$ for large $n$, small $\alpha = \frac{i}{n}$, and small $p$. We'll derive the first correction factor to the $Pois(np)$ approximation $PA_i(\lambda) = e^{-\lambda}\frac{\lambda^i}{i!}$, writing $n_{(i)} = n(n-1)\cdots(n-i+1)$ for the falling factorial:
$$\begin{align*}
P_i &= \binom{n}{i} p^i (1-p)^{n-i} = e^{-\lambda}\frac{\lambda^i}{i!}\cdot\frac{n_{(i)}}{n^i}\cdot\frac{(1-p)^{n(1-\alpha)}}{e^{-np}} \\
&\approx PA_i(\lambda)\left(1 - \frac{\alpha}{2} + \frac{1}{2n}\right)^i e^{n(\alpha p - p^2/2)} \\
&= PA_i(\lambda)\, e^{\alpha/2 - n\alpha^2/2 + np(\alpha - p/2)} \\
&= PA_i(\lambda)\, e^{\alpha/2 - \frac{n}{2}(\alpha - p)^2} \\
&= PA_i(\lambda)\, e^{\alpha/2 - \frac{(i-\lambda)^2}{2n}} \\
&\approx PA_i(\lambda)\exp\left(\frac{p}{2}\left(1 - \frac{(i-\lambda)^2}{\lambda} + \frac{i-\lambda}{\lambda}\right)\right) \\
&\approx PA_i(\lambda)\exp\left(\frac{p}{2}\left(1 - \frac{(i-\lambda)^2}{\lambda}\right)\right)
\end{align*}$$

Hence the approximation is best when we are one standard deviation away from the mean, underestimates within one standard deviation, and overestimates outside the one-standard-deviation region, which coincides with the intuition that the Poisson is a slightly overdispersed binomial, and it worsens as $p$ increases.
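As an illustrative (not definitive) check of this correction factor, here is a Python sketch comparing the exact binomial pmf, the plain Poisson approximation, and the corrected one; the parameters $n = 200$, $p = 0.05$ (so $\lambda = 10$, standard deviation $\approx 3.16$) are our own choice:

```python
from math import comb, exp, factorial

n, p = 200, 0.05
lam = n * p  # lambda = 10

def binom_pmf(i):
    """Exact binomial pmf B_i(n, p)."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def poisson_pmf(i):
    """Plain Poisson approximation PA_i(lambda)."""
    return exp(-lam) * lam**i / factorial(i)

def corrected_pmf(i):
    """Poisson approximation times the derived correction factor."""
    return poisson_pmf(i) * exp(0.5 * p * (1 - (i - lam)**2 / lam))

for i in (5, 10, 13, 16):  # inside, at, near, and beyond one standard deviation
    print(i, binom_pmf(i), poisson_pmf(i), corrected_pmf(i))
```

For these parameters the corrected value lands noticeably closer to the exact binomial pmf than the plain Poisson value does, both at the mean and in the tails.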