Finding a confidence interval for a binomial proportion without knowing the mean or variance?


I'm just learning statistics and I've been given an interesting problem that I'm unsure how to approach. I've dealt with various procedures (t-test, chi-squared test, confidence intervals, etc.), but I'm unsure how to apply them to this problem.

Given 20,000 products, we take a random sample of 320 products and find that 59 of them are faulty. Identify with 95% confidence the confidence interval of the ratio of faulty products.

Since this is all the information we have, I don't know the mean or variance, because this was a single trial. Or do we assume the mean to be 59 and the variance 0? I've never dealt with this kind of problem before, and I believe I may be overthinking it.


From comments:

This would be a binomial proportion confidence interval. There are various different approaches.

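You do have a sample mean for the faulty proportion, $\frac{59}{320}$, and a positive sample variance, because the sample is really 320 separate faulty/not-faulty observations rather than a single trial.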
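As a small illustration of that point, here is a sketch in R that treats the sample as 320 observations coded 1 (faulty) or 0 (not faulty). The variable names are mine and are not part of the original comment.

```r
# Question's data: 59 faulty items in a random sample of 320
x <- 59
n <- 320

# Code each sampled item as 1 (faulty) or 0 (not faulty)
obs <- c(rep(1, x), rep(0, n - x))

p.hat <- mean(obs)   # sample proportion, equal to x/n
s2    <- var(obs)    # sample variance of the 0/1 observations

p.hat
s2
# For 0/1 data the sample variance equals p.hat*(1 - p.hat)*n/(n - 1)
p.hat * (1 - p.hat) * n / (n - 1)
```

So the mean is not 59 and the variance is not 0: the mean is the proportion $59/320$, and the variance is positive because the observations are not all equal.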


The link in @Henry's Answer (+1) discusses several styles of confidence interval for the binomial success probability. The Wald interval should not be used for small $n,$ but with $x = 59$ successes in $n=320$ trials, it should give useful results.

Below, I will illustrate the Wald style of confidence interval and three others.

Wald. To begin, we estimate $p$ by $\hat p = x/n.$ The variance of $\hat p$ is $\frac{p(1-p)}{n},$ which is estimated by $\frac{\hat p(1-\hat p)}{n}.$ So a 95% Wald interval, based on a normal approximation, is $$\hat p \pm 1.96\sqrt{\frac{\hat p(1-\hat p)}{n}}.$$ For large $n$ the various approximations become increasingly accurate. For my example, with $x = 59$ and $n = 430$ (note that the question itself has $n = 320$), the 95% Wald interval computes to $(0.1047,\, 0.1697),$ using R as a calculator.

p.hat = 59/430
CI.w = p.hat + qnorm(c(0.025,0.975))*sqrt(p.hat*(1-p.hat)/430)
CI.w
 [1] 0.1046887 0.1697299
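Because the computation above uses $n = 430$ while the question has $n = 320,$ it may help to wrap the same formula in a function so that it can be rerun with either sample size. This is only a sketch; the name `wald_ci` and its arguments are my own choices, not an existing R function.

```r
# Wald interval for a binomial proportion (hypothetical helper)
wald_ci <- function(x, n, conf.level = 0.95) {
  p.hat <- x / n
  z  <- qnorm(1 - (1 - conf.level) / 2)   # about 1.96 for 95%
  se <- sqrt(p.hat * (1 - p.hat) / n)     # estimated standard error of p.hat
  c(lower = p.hat - z * se, upper = p.hat + z * se)
}

wald_ci(59, 430)   # reproduces the interval computed above
wald_ci(59, 320)   # the question's data
```

With the question's data the interval is centered at $59/320 \approx 0.184$ and comes out to roughly $(0.142,\, 0.227).$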

Agresti-Coull. An interval that works better for smaller $n,$ due to Agresti and Coull, uses an estimator of $p$ that is 'shrunken' slightly toward $0.5$: two successes and two failures are added to the data before the Wald formula is applied. For my example it is $(0.1079,\, 0.1733).$ This style of interval is widely used, perhaps partly because it is simple to compute on a hand calculator. [Perhaps see this Q&A for more on Wald and Agresti CIs.]

p.est = (59+2)/(430+4)
CI.ac = p.est + qnorm(c(0.025,0.975))*sqrt(p.est*(1-p.est)/434)
CI.ac
[1] 0.1078541 0.1732519
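The "+2" and "+4" above are the usual 95% shortcut: they round $z^2/2$ and $z^2,$ where $z \approx 1.96$ and $z^2 \approx 3.84.$ A sketch of the general version for any confidence level follows; the name `agresti_coull_ci` is hypothetical, and because it uses $z^2$ rather than 4 its 95% result differs slightly from the output above.

```r
# Agresti-Coull interval, general confidence level (hypothetical helper)
agresti_coull_ci <- function(x, n, conf.level = 0.95) {
  z     <- qnorm(1 - (1 - conf.level) / 2)
  n.adj <- n + z^2                    # adjusted number of trials
  p.adj <- (x + z^2 / 2) / n.adj      # estimate shrunken toward 0.5
  se    <- sqrt(p.adj * (1 - p.adj) / n.adj)
  c(lower = p.adj - z * se, upper = p.adj + z * se)
}

agresti_coull_ci(59, 430)   # close to the "+2/+4" result above
agresti_coull_ci(59, 320)   # the question's data
```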

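Jeffreys. The Jeffreys interval estimate $(0.1072,\, 0.1721)$ comes from a Bayesian background with a non-informative $\mathsf{Beta}(\frac 12, \frac 12)$ prior distribution, but works fine as a 95% frequentist confidence interval. Its endpoints are the 0.025 and 0.975 quantiles of the posterior distribution $\mathsf{Beta}(x + \frac 12,\, n - x + \frac 12).$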

qbeta(c(.025,0.975), 59.5, 430.5-59) 
[1] 0.1071571 0.1721247
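The same `qbeta` call can be written as a small function of $x$ and $n,$ which makes the $\mathsf{Beta}(x + 0.5,\, n - x + 0.5)$ parameters explicit. This is a sketch with a hypothetical name, `jeffreys_ci`; setting the lower limit to 0 when $x = 0$ and the upper limit to 1 when $x = n$ is a common convention that I am assuming here, and it is not used in the computation above.

```r
# Jeffreys interval from Beta(x + 0.5, n - x + 0.5) quantiles (hypothetical helper)
jeffreys_ci <- function(x, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  ci <- qbeta(c(alpha / 2, 1 - alpha / 2), x + 0.5, n - x + 0.5)
  if (x == 0) ci[1] <- 0   # assumed boundary convention
  if (x == n) ci[2] <- 1   # assumed boundary convention
  c(lower = ci[1], upper = ci[2])
}

jeffreys_ci(59, 430)   # same numbers as the qbeta call above
jeffreys_ci(59, 320)   # the question's data
```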

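Clopper-Pearson. Finally, an exact interval is computed by the procedure binom.test in R. It is "exact" in the sense that it is long enough to ensure at least 95% coverage probability for all possible values of $p.$ [The formulas for the Clopper-Pearson CI are intricate, and so are often left for computation by software.] Note that the call below uses $n = 439$ rather than $430,$ so the interval $(0.1039,\, 0.1699)$ that it prints is for 59 successes in 439 trials and is not directly comparable with the three intervals above.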

binom.test(59, 439)$conf.int
[1] 0.1039068 0.1699133
attr(,"conf.level")
[1] 0.95
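The Clopper-Pearson limits can also be written as beta quantiles: the lower limit is the $\alpha/2$ quantile of $\mathsf{Beta}(x,\, n - x + 1)$ and the upper limit is the $1 - \alpha/2$ quantile of $\mathsf{Beta}(x + 1,\, n - x).$ The sketch below uses that form and compares it with `binom.test` for $n = 430$ and for the question's $n = 320.$ The function name `clopper_pearson_ci` is hypothetical.

```r
# Clopper-Pearson ("exact") interval via beta quantiles (hypothetical helper)
clopper_pearson_ci <- function(x, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  lower <- if (x == 0) 0 else qbeta(alpha / 2, x, n - x + 1)
  upper <- if (x == n) 1 else qbeta(1 - alpha / 2, x + 1, n - x)
  c(lower = lower, upper = upper)
}

clopper_pearson_ci(59, 430)
binom.test(59, 430)$conf.int   # should agree with the line above

clopper_pearson_ci(59, 320)    # the question's data
```

Rerunning with $n = 430$ puts this interval on the same footing as the Wald, Agresti-Coull, and Jeffreys intervals computed earlier.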

You might look at the Wikipedia link for more on these and some of the other binomial confidence intervals in common use. For now, you should use whichever one is explained in your textbook or class notes.