I have a question about testing whether a probability distribution is Poisson. However, I'm having trouble calculating the estimated mean of the Poisson distribution from the sample, which I need in order to build a contingency table and solve the problem. The question is as follows.
In a period of 100 minutes there were a total of 190 arrivals at a highway toll booth. The accompanying table shows the frequency of arrivals per minute over this period. Test the null hypothesis that the population distribution is Poisson.
$$\begin{array}{l|ccccc} \text{Number of arrivals per minute} & 0 & 1 & 2 & 3 & 4 \text{ or more} \\ \hline \text{Observed frequency} & 10 & 26 & 35 & 24 & 5 \end{array}$$
Now, if the last category were just 4 instead of 4 or more, I know I could estimate the mean by multiplying each number of arrivals by its observed frequency, summing, and then dividing by the total frequency. How do I deal with the "4 or more" category, though? I feel like it has something to do with the fact that we had 190 arrivals.
The subject line of your question does not ask the same question as the body of the question.
If we assume that the distribution of arrivals is Poisson, and the goal is to estimate the rate $\lambda$ given the data, this can be done through maximum likelihood estimation. If $X$ is the random number of arrivals in one minute, then $$\Pr[X = x] = e^{-\lambda} \frac{\lambda^x}{x!}, \quad x = 0, 1, 2, \ldots.$$ Thus $$\Pr[X \ge 4] = 1 - \Pr[X \le 3] = 1 - e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2} + \frac{\lambda^3}{6} \right).$$

It follows that the likelihood function for the observed sample of frequencies $\boldsymbol n = (n_1, n_2, n_3, n_4, n_5) = (10, 26, 35, 24, 5)$ is $$\begin{align*} \mathcal L(\lambda \mid \boldsymbol n) &= \Pr[X = 0]^{n_1}\Pr[X = 1]^{n_2}\Pr[X = 2]^{n_3}\Pr[X = 3]^{n_4} \Pr[X \ge 4]^{n_5} \\ &= \left( e^{-\lambda} \right)^{10} \left(e^{-\lambda} \lambda\right)^{26} \left(e^{-\lambda} \frac{\lambda^2}{2}\right)^{35} \left(e^{-\lambda} \frac{\lambda^3}{6}\right)^{24} \left(1 - e^{-\lambda}(1 + \lambda + \tfrac{1}{2}\lambda^2 + \tfrac{1}{6}\lambda^3)\right)^{5} \\ &\propto e^{-95\lambda} \lambda^{168} (6 - e^{-\lambda}(6+6\lambda+3\lambda^2+\lambda^3))^5 , \end{align*}$$ where the constant factors that do not depend on $\lambda$ have been dropped.

Up to an additive constant, the log-likelihood is $$\ell(\lambda \mid \boldsymbol n) = -95 \lambda + 168 \log \lambda + 5 \log(6 - e^{-\lambda}(6+6\lambda+3\lambda^2+\lambda^3))$$ and its derivative is $$\frac{\partial \ell}{\partial \lambda} = -95 + 168\lambda^{-1} - \frac{5\lambda^3}{(6+6\lambda+3\lambda^2+\lambda^3)-6e^\lambda}.$$ Unfortunately, setting this derivative to zero has no closed-form solution, so the critical point must be found numerically; we get $$\hat\lambda \approx 1.9047146490803639297$$ as the maximum likelihood estimate.
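The numerical step can be reproduced in a few lines. The following is only a minimal sketch, assuming plain bisection on the derivative above (the answer does not say which numerical method was used); the helper names `score` and `bisect_root` and the bracketing interval $[1, 3]$ are choices made for this illustration.

```python
import math

# Observed frequencies for 0, 1, 2, 3 and "4 or more" arrivals per minute.
counts = [10, 26, 35, 24, 5]

n_exact = sum(counts[:4])                               # 95 minutes with an exact count
total_exact = sum(x * c for x, c in enumerate(counts[:4]))  # 168 arrivals in those minutes
n_tail = counts[4]                                      # 5 minutes with "4 or more"


def score(lam):
    """Derivative of the grouped log-likelihood with respect to lambda."""
    poly = 6 + 6 * lam + 3 * lam**2 + lam**3
    return -n_exact + total_exact / lam + n_tail * lam**3 / (6 * math.exp(lam) - poly)


def bisect_root(f, lo, hi, tol=1e-12):
    """Locate a root of f in [lo, hi]; assumes f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


# The score is positive at lambda = 1 and negative at lambda = 3 (assumed bracket).
lam_hat = bisect_root(score, 1.0, 3.0)
print(lam_hat)
```

Bisection is slow but needs nothing beyond the sign change of the derivative; Newton's method or a library root finder would serve equally well.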
This estimate serves as the basis for a goodness-of-fit test. See this reference beginning on page 84, for the derivation of the test statistic, which is $$ \Lambda = 2 \sum_{i=1}^k n_i \log \frac{n_i}{n \pi_i(\hat\lambda)} \sim \chi^2_{k-2},$$ where the distribution holds approximately under the null hypothesis, $\pi_i$ is the expected probability of observing $X \in x_i$ under the assumption that $\lambda = \hat\lambda$, $n_i$ are the observed frequencies of $X \in x_i$, and $n = \sum n_i$ is the total number of observations. Here, there are $k = 5$ groups of frequencies collected, with $x_1 = \{0\}$, $x_2 = \{1\}$, and so on, up to the final "tail" group $x_5 = \{4, 5, 6, \ldots\}$.
Using this, your value of the test statistic should be $$\Lambda \approx 12.58533.$$ Then compare this result against the sampling distribution of the test statistic, here a chi-square distribution with $k - 2 = 3$ degrees of freedom, and compute the $p$-value.
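As a check on the arithmetic, here is a small sketch that evaluates $\Lambda$ and the corresponding upper-tail probability. It assumes the value of $\hat\lambda$ quoted above and uses the closed form of the chi-square survival function for 3 degrees of freedom, $\Pr[\chi^2_3 > x] = \operatorname{erfc}\left(\sqrt{x/2}\right) + \sqrt{2x/\pi}\, e^{-x/2}$, so that only the standard library is needed; the names `g_stat` and `chi2_sf_3df` are made up for this illustration.

```python
import math

# Observed frequencies for 0, 1, 2, 3 and "4 or more" arrivals per minute.
counts = [10, 26, 35, 24, 5]
lam_hat = 1.9047146490803639  # maximum likelihood estimate quoted in the answer

n = sum(counts)

# Fitted cell probabilities: Poisson pmf for 0..3, remaining mass for the tail group.
probs = [math.exp(-lam_hat) * lam_hat**x / math.factorial(x) for x in range(4)]
probs.append(1.0 - sum(probs))

# Likelihood-ratio statistic: 2 * sum of n_i * log(n_i / (n * pi_i)).
g_stat = 2.0 * sum(c * math.log(c / (n * p)) for c, p in zip(counts, probs))


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


p_value = chi2_sf_3df(g_stat)
print(g_stat, p_value)
```

With SciPy available, `scipy.stats.chi2.sf(g_stat, df=3)` should give the same tail probability for any number of degrees of freedom.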