How to write R program to solve the confidence interval?

454 Views Asked by At

The problem: let $X_1,\ldots,X_n$ be random variable from $\mathrm{Poisson}(\theta)$. Under $H_0: \theta=\theta_0$, we want to find the $(1-\alpha)100\%$ confidence interval for $\theta$ by using the likelihood ratio interval, $-2\ln\lambda(x)$. Now, I have $$1-\alpha = P\{2n[(\theta-\bar{X})-\bar{X} \ln(\theta/\bar{X})] \leq \chi^2_{0.05,df.=1}\}$$ where $2n[(\theta-\bar{X})-\bar{X}\ln(\theta/\bar{X})]$ has chi-square distribution with df. one for large sample size. This is not a formula for the upper limit of $\theta$. Thus I must use the R program to solve it but I have no idea. Please suggest me how to do and for improve my work. Thanks.

1

There are 1 best solutions below

2
On

Confidence Intervals for Poisson Mean

Here are some clues that I hope are helpful. I am doing a grid search for the boundaries because I have no idea what methods you have been taught for that. Also, I do not know what kind of programming structure you are supposed to use, so I will just show you the steps I used to implement this method in one case.

This is more a 'proof of concept' than an answer, but it is interesting to know whether the method works to produce apparently useful CIs.

As you can see, I sampled $n = 100$ observations from $Pois(\theta = 5)$ and then found a '95% confidence interval' (hoping it would contain $\theta = 5,$ which happens to be the case). It does seem that this method works.

This is an asymptotic criterion, and "infinity is a long way off." If you want to verify that the CI truly has 95% coverage probability, you would have to repeat the process several thousand times with the same $\theta$ and sample size, and count the percentage of iterations that produce an interval including 5. (But I do not see that as part of the problem as stated.)

 n = 100;  x = rpois(n, 5);  xbar=mean(x)    # data
 th = seq(.1,10,by=.001);  q = qchisq(.95,1) # grid to search
 crit = 2*n*(th-xbar-xbar*log(th/xbar))      # criterion
 min(th[crit <= q])                          # lower end of CI
 ## 4.776
 max(th[crit <= q])                          # upper end of CI
 ## 5.67

From what you have revealed about your situation, this is the best I can do. If you can't see how to modify what I have shown into a suitable 'program', then please leave a comment with additional information and maybe I (or someone else) can be more helpful.

Note: As in the Comment from @Michael Hardy, I do not see the relevance of the statement of $H_0,$ but any $\theta_0$ in the interval would be "non-rejectable" as as a null hypothesis against a two-sided alternative.

Addenda: (1) The following code (executed after the code above) will make a graph (not shown here) that helps visualize what the grid search does. (See comment by @Bey.)

 cond = (th>4 & th<6)      # part of curve to plot
 plot(th[cond], crit[cond], type="l")
 abline(v=5.21)                       # sample mean
 abline(v=4.776, col="red")           # lower conf limit
 abline(v=5.670, col="red")           # upper conf limit
 abline(h=q, col="blue")              # chi-sq value

(2) A very similar interval estimate that is a lot easier to handle comes from a Bayesian context with an "improper" gamma prior, Poisson likelihood, and, for the same data as above, posterior $Gamma(shape=521, rate=100).$ Such Bayesian interval estimates are sometimes used as (frequentist) CIs. In this case, the computation is:

 qgamma(c(.025,.975), sum(x), n)
 ## 4.772175 5.666765

This computation might be used to judge the extent of values that must be included in the grid vector 'th'.