I have a process/model that depends on a single parameter $\lambda$ and generates $n+1$ outcomes. From $N$ events I can estimate the probabilities as $\hat{p}_k = N_k/N$, which is the MLE for a multinomial distribution (MD). From my model I can generate probabilities $p_k$ for any given $\lambda$.
How do I construct the best estimate of $\lambda$ and an uncertainty $\Delta \lambda$, and how do I decide whether my model is reasonable?
Here is what I know from various searches and simulation tests. For a given set of data $N \hat{p}$ with covariance matrix $C$, I can use $$ (N\hat{p}-Np)^T C^{-1} (N\hat{p}-Np) $$ and choose the $\{p_k\}$ that minimizes this function. That is not directly helpful in this case, primarily because $C$ for a MD is not invertible. However, it has a singular value decomposition $$ C=U S U^T $$ and the zero singular value is associated with the constraint $\sum p_k=1$, which also holds for any estimates of $p_k$ I get from my $N$ events. So I can construct the inverse in the usual way, ignoring the zero singular value when I invert $S$ and leaving that entry as zero (i.e. a pseudo-inverse). For any given set of $\{p_k\}$, the function so constructed is distributed as a $\chi^2$ with $n$ degrees of freedom, which I assert based on numerical simulations.
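To make the construction concrete, here is a minimal sketch of how I build the statistic, assuming $C$ is the multinomial covariance $N(\mathrm{diag}(p) - p p^T)$ evaluated at the probabilities $p$ being tested. The function names and the example values of `p_true` and `N` are made up for illustration. It also compares the result with the ordinary Pearson sum $\sum_k (N_k - N p_k)^2/(N p_k)$, which should agree because the difference vector sums to zero.

```python
import numpy as np

def multinomial_cov(p, N):
    """Covariance of multinomial counts: N * (diag(p) - p p^T)."""
    p = np.asarray(p, dtype=float)
    return N * (np.diag(p) - np.outer(p, p))

def chi2_pinv(counts, p, N):
    """Quadratic form using the pseudo-inverse of C.

    np.linalg.pinv inverts the nonzero singular values and leaves the
    (numerically) zero one at zero; rcond sets the relative cutoff.
    """
    d = np.asarray(counts, dtype=float) - N * np.asarray(p, dtype=float)
    C_pinv = np.linalg.pinv(multinomial_cov(p, N), rcond=1e-10, hermitian=True)
    return float(d @ C_pinv @ d)

def chi2_pearson(counts, p, N):
    """Ordinary Pearson statistic, for comparison."""
    counts = np.asarray(counts, dtype=float)
    expected = N * np.asarray(p, dtype=float)
    return float(np.sum((counts - expected) ** 2 / expected))

# Illustrative values only (n + 1 = 4 outcomes).
rng = np.random.default_rng(0)
p_true = np.array([0.1, 0.2, 0.3, 0.4])
N = 1000
counts = rng.multinomial(N, p_true)

q_pinv = chi2_pinv(counts, p_true, N)
q_pearson = chi2_pearson(counts, p_true, N)
print(q_pinv, q_pearson)
```

If the two printed numbers agree, the pseudo-inverse form is just Pearson's statistic in disguise, which would be consistent with the $n$ degrees of freedom I see in simulation.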
So my thinking was to take the estimates $\{\hat{p}_k\}$ that I get from my $N$ events and choose the $\lambda$ that minimizes the function above. In an ordinary chi-square fit, the minimum value, compared with the number of degrees of freedom, tells me how good the fit is, and the parameter uncertainties can be estimated from knowledge of $C$. However, it is not at all clear to me whether this is even a valid procedure or, if it is, how many degrees of freedom are associated with the fit and what boundaries to use for estimating $\Delta \lambda$.
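Here is a sketch of the fit I have in mind, under several assumptions that are not part of my actual problem: the model is a stand-in (binomial probabilities with success probability $\lambda$, giving $n+1$ outcomes), $C$ is evaluated at the model $p(\lambda)$ so the statistic reduces to the Pearson sum, and the minimum is found with a plain grid scan. The use of $n-1$ degrees of freedom and of $\Delta\chi^2 = 1$ for the interval are the conventional large-sample choices for one fitted parameter; they are exactly the things I would like confirmed or corrected.

```python
import numpy as np
from math import comb

def model_probs(lam, n):
    """Hypothetical stand-in model: Binomial(n, lam) pmf over k = 0..n."""
    k = np.arange(n + 1)
    coeff = np.array([comb(n, int(j)) for j in k], dtype=float)
    return coeff * lam**k * (1.0 - lam) ** (n - k)

def chi2(counts, p):
    """Pearson statistic with the covariance evaluated at the model p."""
    N = counts.sum()
    expected = N * p
    return float(np.sum((counts - expected) ** 2 / expected))

# Simulated data with illustrative values.
rng = np.random.default_rng(1)
n, N, lam_true = 4, 5000, 0.3
counts = rng.multinomial(N, model_probs(lam_true, n))

# Grid scan over lambda, avoiding the endpoints where some p_k vanish.
grid = np.linspace(0.05, 0.95, 9001)
scan = np.array([chi2(counts, model_probs(lam, n)) for lam in grid])

i_min = int(np.argmin(scan))
lam_hat, chi2_min = grid[i_min], scan[i_min]

# Assumed convention: interval where chi2 rises by 1 above its minimum.
inside = grid[scan <= chi2_min + 1.0]
lam_lo, lam_hi = inside.min(), inside.max()

# Assumed convention: (n + 1) outcomes - 1 constraint - 1 fitted parameter.
dof = n - 1

print(f"lambda_hat = {lam_hat:.4f}  interval = [{lam_lo:.4f}, {lam_hi:.4f}]")
print(f"chi2_min = {chi2_min:.2f} for {dof} assumed degrees of freedom")
```

Repeating this over many simulated data sets and histogramming `chi2_min` would be one way to check the assumed number of degrees of freedom empirically.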