Estimating $\lambda$ in a Poisson Distribution from a set of data


I need to estimate $\lambda$ from this data (the original image is missing; the observed counts were):

| $x$ | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Observed count | 144 | 91 | 32 | 11 | 2 |

The observed frequencies/probabilities are obtained by dividing each observed count by the total number of observations, $280$.

I know that $P(X=0) = \frac{e^{-\lambda}\cdot \lambda^0}{0!} = e^{-\lambda} = 0.514$, so solving $e^{-\lambda} = 0.514$ gives $\lambda = -\ln(0.514) \approx 0.666$. My notes say this is a correct way of doing it.

However, my notes also say I can solve this by computing a sample average, which gives $\lambda = 0.684$, and that this produces the theoretical values in the table. How do I do this? I don't know how the theoretical values were obtained.



If $f(v)$ is the observed frequency for value $v$ (i.e. the number of times $v$ was observed, divided by the number of trials), then the sample average is $\sum_v v f(v)$, where the sum is over all observed values. The sample average is an unbiased estimator of the mean of the distribution. Since the mean of the Poisson distribution is $\lambda$, you can use this to estimate $\lambda$.
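Using the counts from the table ($144, 91, 32, 11, 2$ for $v = 0, \dots, 4$), the sample average $\sum_v v f(v)$ can be computed directly. A minimal sketch in Python:

```python
# Observed counts for v = 0..4, taken from the table (total n = 280)
counts = {0: 144, 1: 91, 2: 32, 3: 11, 4: 2}

n = sum(counts.values())                   # number of trials, 280
f = {v: c / n for v, c in counts.items()}  # observed relative frequencies f(v)

# Sample average: sum over v of v * f(v), an unbiased estimator of lambda
lambda_hat = sum(v * fv for v, fv in f.items())
print(lambda_hat)  # ~0.7
```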

The "theoretical" values in the table are then obtained using the formula for the Poisson distribution, $$\mathbb P(X=x) = e^{-\lambda} \frac{\lambda^x}{x!}$$

Note that $\lambda^x$ is in the numerator, not the denominator. Of course, for $x=0$ it makes no difference.
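With an estimate of $\lambda$ in hand, the theoretical column of the table is just this pmf evaluated at each $x$ (and multiplied by $280$ for expected counts). A quick sketch, using $\hat\lambda = 0.7$ from the sample average of the counts above:

```python
import math

lam = 0.7  # lambda estimated via the sample average

def poisson_pmf(x, lam):
    """P(X = x) = exp(-lam) * lam**x / x! -- note lam**x is in the numerator."""
    return math.exp(-lam) * lam**x / math.factorial(x)

# Theoretical probabilities and expected counts out of 280 trials
for x in range(5):
    p = poisson_pmf(x, lam)
    print(x, round(p, 4), round(280 * p, 1))
```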


If I were given this question, with the observed frequencies in the table, and nothing else, I would use maximum likelihood to compute $\lambda$. The likelihood function is $$\begin{align} \mathcal L(\lambda \mid \boldsymbol x) &\propto \prod_{k=0}^4 \Pr[X = k]^{\sum \mathbb 1(x_i = k)} \\ &= \left(e^{-\lambda} \frac{\lambda^0}{0!}\right)^{144} \left(e^{-\lambda} \frac{\lambda^1}{1!}\right)^{91} \left(e^{-\lambda} \frac{\lambda^2}{2!}\right)^{32} \left( e^{-\lambda} \frac{\lambda^3}{3!}\right)^{11} \left(e^{-\lambda} \frac{\lambda^4}{4!}\right)^{2} \\ &\propto e^{-280\lambda} \lambda^{196}, \end{align}$$
where we ignore any constant factors with respect to $\lambda$. The log-likelihood is then, up to an additive constant, $$\ell(\lambda \mid \boldsymbol x) = -280 \lambda + 196 \log \lambda,$$ and its derivative is $$\frac{\partial \ell}{\partial \lambda} = -280 + \frac{196}{\lambda}.$$ So the log-likelihood is maximized at the critical point satisfying $\partial \ell/\partial \lambda = 0$, i.e. $$\hat \lambda = \frac{196}{280} = 0.7.$$
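As a sanity check, the closed-form maximizer can be compared against the log-likelihood directly. A sketch, where the constants $280$ and $196$ are the trial count and total event count from the table:

```python
import math

def log_lik(lam):
    # Log-likelihood up to an additive constant: -280*lam + 196*log(lam)
    return -280 * lam + 196 * math.log(lam)

lam_mle = 196 / 280  # the closed-form critical point, 0.7

# The log-likelihood at the critical point beats nearby values of lambda
assert all(log_lik(lam_mle) > log_lik(lam_mle + d)
           for d in (-0.05, -0.01, 0.01, 0.05))
print(lam_mle)  # 0.7
```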

It is not a difficult exercise to show that the sample mean coincides with the MLE when observations are not censored (i.e., the exact values are known). That is not to say, however, that no other estimators exist: indeed, the approach $$\Pr[X = 0] = e^{-\lambda} = \frac{144}{280}$$ does yield an estimate, $\tilde \lambda \approx 0.664976$.

But we know from the theory of sufficient statistics that this estimator is not based on a sufficient statistic: it discards information about $\lambda$ that is present in the original sample, namely the frequencies of the nonzero observations. In this sense the estimator is inferior: although it is consistent, it has a larger variance than the MLE.

A good way to build intuition for this is to note that if a sample is drawn from a Poisson distribution with a very large $\lambda$, say $\lambda \approx 100$, we would expect to see a zero with probability $\Pr[X = 0] = e^{-100} \approx 3.72008 \times 10^{-44}$. In practical terms, that is "never." So using the frequency of zeroes to estimate $\lambda$ in such a case is extremely unlikely to yield a meaningful estimate, since you will almost surely not have observed any zeroes. Yet even with a small sample size, say $n = 10$, we might observe something like $$(80, 97, 81, 92, 102, 91, 94, 97, 93, 94),$$ in which case the entire sample gives $\hat \lambda = 92.1$.
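The variance claim can also be checked by simulation. A sketch using NumPy, where the sample size $n = 280$ and $\lambda = 0.7$ mirror the table and the repetition count is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(0)
lam_true, n, reps = 0.7, 280, 5000

mle_estimates, zero_estimates = [], []
for _ in range(reps):
    sample = rng.poisson(lam_true, size=n)
    # MLE: the sample mean
    mle_estimates.append(sample.mean())
    # Zero-frequency estimator: solve P(X=0) = exp(-lambda) for lambda
    zero_estimates.append(-np.log(np.mean(sample == 0)))

# Both estimators are consistent, but the zero-frequency one is noisier
print(np.var(mle_estimates), np.var(zero_estimates))
```

With $\lambda = 0.7$ roughly half the draws are zero, so the zero-frequency estimator is usable here; its simulated variance still comes out clearly larger than the MLE's, in line with the argument above.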