Confidence intervals on maximum likelihoods of observed data

I observed 400 episodes of nursing care in a hospital and tracked the movement of the nurses between 5 rooms, $A$ to $E$. The maximum likelihood estimate of the probability that a nurse moves from room $i$ to room $j$ is given by:

\begin{equation} P_{ij}=\dfrac{\text{# of transitions from room $i\rightarrow j$}}{\text{Total # of transitions from room $i$ to any room}}\end{equation}
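As a minimal sketch of this estimate, the snippet below row-normalises a matrix of transition counts. It assumes the denominator means "all transitions out of room $i$"; the function name `transition_mle` and the counts are hypothetical, made up purely for illustration.

```python
def transition_mle(counts):
    """Estimate P[i][j] as (# of i -> j transitions) / (# of transitions out of i)."""
    probs = []
    for row in counts:
        total = sum(row)
        # A room that was never left has no data; report zeros for that row.
        probs.append([c / total if total else 0.0 for c in row])
    return probs


# Hypothetical counts (NOT real data): rows are origin rooms A-E, columns are destinations.
example_counts = [
    [0, 12, 5, 3, 0],
    [10, 0, 8, 2, 5],
    [4, 9, 0, 7, 0],
    [2, 3, 6, 0, 9],
    [1, 4, 0, 10, 0],
]

P = transition_mle(example_counts)
for room, row in zip("ABCDE", P):
    print(room, [round(p, 3) for p in row])
```

Each row of `P` sums to 1, so every row can be read as the estimated distribution of the next room given the current one.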

  • Is there a way of defining a confidence interval on this maximum likelihood estimate $P_{ij}$?
  • And for all maximum likelihood estimates of all possible room combinations?

Reference:

I have come across a reference: http://arxiv.org/pdf/0905.4131v1.pdf. It suggests that, for $n$ observations $X_i$, the difference between the empirical maximum likelihood estimate $\hat P_{ij}$ and the actual transition probability $P_{ij}$, scaled by $\sqrt{n}$, tends to a multivariate normal distribution with mean 0 and covariance matrix $\Sigma$.

$$\sqrt{n}\,(\hat P_{ij}-P_{ij})\sim N(0,\Sigma)\quad \text{as}\quad n\rightarrow \infty$$

How do I calculate $\Sigma$ from my observed data? And how does this relate to confidence intervals?

There is 1 best solution below

One could probably think about the nature of the dependence among the various random variables that would have been observed, but for now I'll do something simpler:

You have $n$ independent Bernoulli trials; in this case $n=400$. You have $x$ successes; in this case, $x$ is the numerator in the fraction. So the number of successes is binomially distributed with an unobservable parameter $p$. Theory tells us that the expected number of successes is $400p$ and that its variance is $400p(1-p)$. Hence the proportion of successes $x/400$ has expected value $p$ and variance $p(1-p)/400$.

By the central limit theorem, the proportion is approximately normally distributed, so the standardized quantity $$ \frac{x/400-p}{\sqrt{p(1-p)/400}} $$ has about a $0.95$ chance of lying between $-1.96$ and $+1.96$. Since $p$ is unknown, we use $x/400$ as an estimate of $p$ in the standard error. Our approximate $95\%$ confidence interval therefore has endpoints $$ \frac{x}{400} \pm 1.96\sqrt{\frac{(x/400)(1-(x/400))}{400}} $$
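To make the arithmetic concrete, here is a minimal Python sketch of the normal-approximation interval described above. The function name `wald_interval` and the count $x=60$ are hypothetical choices for illustration; only $n=400$ and the multiplier $1.96$ come from the discussion.

```python
import math


def wald_interval(x, n, z=1.96):
    """Normal-approximation confidence interval for a binomial proportion.

    x: number of successes, n: number of trials,
    z: normal quantile (1.96 gives roughly 95% coverage).
    """
    p_hat = x / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical example: x = 60 observed i -> j moves out of n = 400 trials.
lower, upper = wald_interval(60, 400)
print(f"estimate = {60 / 400:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

This interval can behave poorly when $x$ is very close to $0$ or to $n$, because the normal approximation is then rough and the endpoints can fall outside $[0,1]$.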

In many textbooks, you'll see a section called something like "Confidence interval for a proportion" that covers this.