The Question:
Suppose $X_1,\dots,X_{n}$ are independent and identically distributed random variables that take values in $\{0,1,2,\dots \}$. We gather the following data:
\begin{array}{c|cccccccccccc} \text{Value} & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & \ge 11 \\ \hline \text{Frequency} & n_0 & n_1 & n_2 & n_3 & n_4 & n_5 & n_6 & n_7 & n_8 & n_9 & n_{10} & n_{11} \end{array}
Let $\Bbb P(X_k=i)=\pi_i$ for $i=0,\dots,10$ and $\pi_{11}=\Bbb P(X_k\ge 11)$, so that $\sum_{i=0}^{11}\pi_i=1$, and we test the null hypothesis
$$H_0:\pi _i=\pi_i (\theta), \qquad i=0,1,\dots,11$$
(where $\theta$ is an unknown parameter) against the general alternative.
(i) If we want to model the probabilities $\pi_i(\theta)$ with a Poisson distribution of mean $\theta$, explain how you could estimate each $\pi_i(\theta)$.
(ii) Having fit the Poisson model to the data (the $n_i$ are known of course), we find that the Likelihood Ratio Statistic is $\Lambda = 14$. Test the goodness of fit for the Poisson model and explain your conclusion.
[You may assume that $\Bbb P(\chi^2_{10} \le 14) = 0.82$, $\Bbb P(\chi^2_{11} \le 14) = 0.77$, $\Bbb P(\chi^2_{12} \le 14) = 0.70$.]
My Answer:
(i) First find the Maximum Likelihood Estimator (MLE) $\hat\theta$ of $\theta$ by maximizing the multinomial likelihood $L(\theta)=\prod_{i=0}^{11}\pi_i(\theta)^{n_i}$ over the observed counts. Then, by the invariance property of MLEs, estimate each cell probability by
$$\hat\pi_i = \pi_i(\hat\theta) = \frac{e^{-\hat \theta}\hat \theta^{\,i}}{i!}, \qquad i=0,1,\dots,10, \qquad \hat\pi_{11} = 1 - \sum_{i=0}^{10}\hat\pi_i.$$
These are the estimates that make the observed data most likely under the Poisson model.
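To make part (i) concrete, here is a small stdlib-only sketch that maximizes the grouped-data log-likelihood numerically. The counts are made up purely for illustration, and the crude grid search stands in for a proper optimizer:

```python
import math

# Hypothetical observed counts n_0, ..., n_10, n_{>=11} (illustrative only)
counts = [5, 15, 25, 28, 20, 14, 8, 5, 3, 1, 1, 0]
n = sum(counts)

def cell_probs(theta):
    """pi_i(theta) for i = 0..10 under Poisson(theta); pi_11 = P(X >= 11)."""
    p = [math.exp(-theta) * theta**i / math.factorial(i) for i in range(11)]
    p.append(max(1.0 - sum(p), 1e-300))  # tail cell, guarded against rounding
    return p

def log_lik(theta):
    """Multinomial log-likelihood of the grouped counts."""
    return sum(ni * math.log(pi) for ni, pi in zip(counts, cell_probs(theta)))

# Crude grid search for the MLE theta-hat (a real analysis would use an optimizer)
theta_hat = max((t / 1000 for t in range(1, 20000)), key=log_lik)
pi_hat = cell_probs(theta_hat)
```

With no observations in the $\ge 11$ cell, the grouped-data MLE here coincides with the sample mean $\sum_i i\,n_i / n$; in general the tail cell pulls the two apart slightly.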
(ii) Under the null hypothesis there is $1$ free parameter, since the values of all the $\pi_i(\theta)$ are specified through the single parameter $\theta$.
On the other hand, the general alternative has $11$ degrees of freedom, as we have $12$ free probabilities $\pi_0,\dots,\pi_{11}$ subject to the single constraint $\sum_{i=0}^{11}\pi_i=1$.
Hence, under $H_0$ and for large $n$, $\Lambda$ is approximately $\chi_{11-1}^2 = \chi_{10}^2$ distributed. We are given that $\Bbb P(\chi^2_{10} \le 14) = 0.82$, so $\Bbb P(\chi^2_{10} > 14) = 0.18$.
The $p$-value is $0.18$, which is large, so there is little to no evidence against the null: we do not reject the Poisson model at any conventional significance level.
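As a sanity check on the supplied table value, the tail probability $\Bbb P(\chi^2_{10} > 14)$ can be computed in closed form for even degrees of freedom via the Poisson-tail identity, using only the standard library:

```python
import math

def chi2_sf_even(x, df):
    """Survival function P(chi^2_df > x) for even df, using the identity
    P(chi^2_{2k} > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!."""
    assert df % 2 == 0, "closed form holds only for even df"
    half = x / 2.0
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(df // 2))

p_value = chi2_sf_even(14.0, 10)  # ~0.173, consistent with 1 - 0.82 above
```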
I feel that, especially in part (i), I have not explained it very well. Are my answers correct? Why does it even matter in part (i) how we try to estimate the $\pi_i(\theta)$?
Any help would be much appreciated.