How to find a confidence interval of a binomial distribution using a simulated random sample?


I have a random sample of 1000 deviates from a binomial distribution with n = 52 and p^, so I have 1000 values from the distribution.

How can I find a 95% confidence interval for the true value of p? (Without using normal distribution approximations).

It seems I only have the random values as a sample. How do I use them to find two values between which p lies with 95% probability?


By p^, I assume you mean $\hat p = 44/52,$ the estimate of the cure rate $p$ as found in the experiment done by the manufacturer of the new drug.

In R, $B = 1000$ observations from $\mathsf{Binom}(n=52,\, p = 44/52)$ are generated as follows:

set.seed(318);  x = rbinom(1000, 52, 44/52)

Then we see that (a centrally located) 95% of the values $x/52$ lie within the interval $(0.73, 0.94).$

set.seed(318);  x = rbinom(1000, 52, 44/52)
quantile(x/52, c(.025, .975)) 
     2.5%     97.5% 
0.7307692 0.9423077 

This is close to the Wald 95% confidence interval $\hat p \pm 1.96\sqrt{\frac{\hat p(1-\hat p)}{52}},$ which relies on a normal approximation and amounts to $(0.748, 0.944).$

p.est = 44/52;  pm=c(-1,1) 
p.est + pm*1.96*sqrt(p.est*(1-p.est)/52)
[1] 0.7480870 0.9442207
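Since the question asks to avoid normal approximations, one option is an exact (Clopper-Pearson) interval. The sketch below is not part of the computations above; it assumes the data are 44 successes in 52 trials, and uses `binom.test` from base R together with the equivalent beta-quantile form.

```r
# Exact (Clopper-Pearson) 95% interval for p, assuming 44 successes in 52 trials.
# No normal approximation is involved.
ci_exact <- binom.test(x = 44, n = 52, conf.level = 0.95)$conf.int
ci_exact

# The same limits written out with beta quantiles (standard Clopper-Pearson form).
s <- 44; n <- 52; alpha <- 0.05
lower <- qbeta(alpha / 2, s, n - s + 1)
upper <- qbeta(1 - alpha / 2, s + 1, n - s)
c(lower, upper)
```

The exact interval is typically somewhat wider than the Wald interval, because it guarantees at least 95% coverage for every $p$.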

Notes: (1) Of course, without simulation, we could have obtained a more accurate version of the first interval in R, using the quantile function (inverse CDF) of the appropriate binomial distribution.

qbinom(c(.025,.975), 52, 44/52)/52
[1] 0.7500000 0.9423077
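To illustrate how the simulated quantiles relate to the exact ones, here is a small sketch that repeats the simulation for increasing numbers of draws $B$. The seed and the values of $B$ are arbitrary choices for illustration, and `type = 1` is used so that the sample quantiles are actual observed counts.

```r
set.seed(1)                            # arbitrary seed, only for reproducibility
n <- 52; p <- 44/52
exact <- qbinom(c(.025, .975), n, p)   # exact binomial quantiles (as counts)

# Empirical quantiles for increasing numbers of simulated values B.
# type = 1 returns order statistics, which suits discrete data.
for (B in c(1e3, 1e4, 1e5)) {
  x <- rbinom(B, n, p)
  q <- quantile(x, c(.025, .975), type = 1)
  cat("B =", B, " simulated:", q / n, " exact:", exact / n, "\n")
}
```

Because the binomial distribution is discrete, a simulated quantile can differ from the exact one by a step of $1/52$ when the cumulative probability at that count is close to 0.025 or 0.975.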

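(2) Especially for $n$ as small as 52, the Agresti-Coull ("plus four") CI has better coverage properties than the (asymptotic) Wald interval. It adds two successes and two failures to the data, so the Wald formula is applied with $\tilde n = 56$ and $\tilde p = 46/56,$ which gives approximately $(0.721, 0.922).$ Perhaps see this Q & A.

n = 56; p.est = 46/n;  pm=c(-1,1) 
p.est + pm*1.96*sqrt(p.est*(1-p.est)/n)
[1] 0.7211166 0.9217406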
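The coverage claim can be checked at a single value of $p$ by summing binomial probabilities over all outcomes whose interval contains $p$. The following is a minimal sketch; the helper names `coverage`, `wald` and `plus_four` are hypothetical, and it evaluates only $p = 44/52$ with $n = 52$, so it does not establish the general result.

```r
# Exact coverage probability of a CI procedure at a given true p:
# add up the binomial probabilities of every outcome x whose interval contains p.
coverage <- function(ci_fun, n, p) {
  x <- 0:n
  ci <- ci_fun(x, n)                   # two-column matrix: lower, upper
  sum(dbinom(x, n, p) * (ci[, 1] <= p & p <= ci[, 2]))
}

# Wald interval for x successes in n trials.
wald <- function(x, n) {
  ph <- x / n
  me <- 1.96 * sqrt(ph * (1 - ph) / n)
  cbind(ph - me, ph + me)
}

# "Plus four": add 2 successes and 2 failures, then apply the Wald formula.
plus_four <- function(x, n) wald(x + 2, n + 4)

n <- 52; p <- 44/52
cov_wald <- coverage(wald, n, p)
cov_pf   <- coverage(plus_four, n, p)
c(wald = cov_wald, plus_four = cov_pf)
```

At this $p$ the Wald interval falls short of the nominal 95%, while the plus-four interval covers more often. Repeating the calculation over a grid of $p$ values would give a fuller picture.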