Confidence interval of functional - R code


Question 1 - Generate 200 data points from the random variable $X \sim exp(2)$, and construct a 95% CI for the functional $\theta=E[e^{X}]$.

My R code is as follows:

x <- rexp(200, rate=1/2)
theta.hat <- mean(exp(x))
sigma.hat <- sd(exp(x))
error <- qnorm(0.975)*sigma.hat/sqrt(200)
left <- theta.hat-error
right <- theta.hat+error

Then (left,right) is a 95% confidence interval for $\theta$. Is this confidence interval correct?

Question 2 - Run a simulation with N=1000 data samples to check the coverage probability of the confidence interval.

My R code is as follows:

N <- 1000
count <- 0
for (i in 1:N) {
  x <- rexp(200, rate=1/2)
  theta.hat <- mean(exp(x))
  sigma.hat <- sd(exp(x))
  error <- qnorm(0.975)*sigma.hat/sqrt(200)
  left <- theta.hat-error
  right <- theta.hat+error
  count <- count+as.double(left<=2 & 2<=right)
}
count/N

However, after running the code a few times, count/N returns values $\approx 0.76$. I was expecting values $\approx 0.95$. Have I made a mistake, or am I missing something?

There is 1 best solution below

I don't know if this is exactly what you're looking for, but it might help you understand what is going on.

You could get a $95\%$ confidence interval in three different ways:
(1) Monte Carlo
(2) Calculate the distribution of $e^X$ and the distribution of the mean of $e^X$
(3) Central Limit Theorem

You used the third one in Question 1. However, as Ian noted, the variance of $e^X$ is not finite, and the Central Limit Theorem in its usual form requires a finite variance, so the normal approximation behind your interval is not justified here. Moreover, you estimated the variance with the sample variance; in general, it is better to plug in the actual variance if you know how to calculate it (which is not an option here, since it is infinite). Furthermore, you found in Question 2 that the coverage probability is not close to $0.95$; this is due to the error introduced by using the Central Limit Theorem where it does not apply.
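
To make the infinite-variance remark concrete, here is a short calculation. It assumes that $\exp(2)$ denotes rate $2$ (density $2e^{-2x}$ on $x \ge 0$), which is the reading under which $\theta = 2$, the value the simulation in Question 2 checks against. That reading is an assumption on my part; the question does not state the parameterization.

```latex
\begin{align*}
E[e^{X}]  &= \int_0^\infty e^{x}\, 2e^{-2x}\,dx  = \int_0^\infty 2e^{-x}\,dx = 2, \\
E[e^{2X}] &= \int_0^\infty e^{2x}\, 2e^{-2x}\,dx = \int_0^\infty 2\,dx = \infty .
\end{align*}
```

So $\operatorname{Var}(e^X) = E[e^{2X}] - (E[e^X])^2$ is infinite. Note also that `rexp(200, rate=1/2)` draws from the density $\tfrac12 e^{-x/2}$ (mean $2$), and under that parameterization even $E[e^X] = \int_0^\infty \tfrac12 e^{x/2}\,dx$ diverges, so it may be worth checking which of the two is intended.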

I haven't tried the second method here, and I don't know whether it can be done exactly (I don't think it can, to be honest). In general, though, it can be useful when, for example, the mean follows a known distribution whose parameters you can calculate.
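
For what it is worth, the first half of the second method (the distribution of $e^X$ itself) is easy to write down. The sketch below again assumes the rate-$2$ reading of $\exp(2)$, which is my assumption rather than something the question states.

```latex
\begin{align*}
P\!\left(e^{X} > t\right) &= P(X > \ln t) = e^{-2\ln t} = t^{-2}, \qquad t \ge 1 .
\end{align*}
```

That is a Pareto distribution with scale $1$ and shape $2$, a heavy-tailed law with finite mean but infinite variance. As far as I know, the mean of 200 such variables has no convenient closed-form distribution, which is why this route is hard to carry through exactly.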

Last but definitely not least, you can always approximate the distribution of the mean of $e^X$ using Monte Carlo methods, which is the first option I gave and is essentially your solution to Question 2.
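
As a rough illustration of the Monte Carlo check, here is a minimal sketch that estimates the coverage of the normal-approximation interval for two rates. The helper name `coverage` and the comparison rate $5$ are my own choices for illustration, and the sketch assumes the rate parameterization, under which $E[e^X] = \lambda/(\lambda-1)$ for rate $\lambda > 1$ and the variance is finite only for $\lambda > 2$.

```r
# Hypothetical helper (not from the question): fraction of N
# normal-approximation intervals for E[exp(X)] that cover theta.
coverage <- function(rate, theta, n = 200, N = 1000, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  hits <- replicate(N, {
    y <- exp(rexp(n, rate = rate))      # sample of exp(X)
    half <- z * sd(y) / sqrt(n)         # half-width of the interval
    abs(mean(y) - theta) <= half        # TRUE if the interval covers theta
  })
  mean(hits)
}

set.seed(1)
# rate 2: E[exp(X)] = 2, variance infinite
cov_rate2 <- coverage(rate = 2, theta = 2)
# rate 5: E[exp(X)] = 5/4, variance finite
cov_rate5 <- coverage(rate = 5, theta = 5 / 4)
print(c(rate2 = cov_rate2, rate5 = cov_rate5))
```

Comparing the two estimates gives a feel for how much of the shortfall from $0.95$ comes from the heavy tail rather than from the simulation code itself.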

I hope this is what you're looking for. If some details are unclear, feel free to comment!