Find a 95% confidence interval on a binomial process.


Let's say that $73\%$ of $1506$ people interviewed were in favor of legalizing gay marriage. What is the $95\%$ confidence interval for the percentage of the public that are in favor of legalizing gay marriage?

I can see that this is a binomial process (either you're in favor or you're not). I haven't done this kind of problem before though so I'm not sure what to do next. Do I say that because this is a large sample the Central Limit Theorem indicates that this is approximately standard normal? I then run into the issue where I don't know what $\sigma$ is, so do I then try to use the T distribution? I'm floundering looking for an approach here!


There are 2 best solutions below

2
On BEST ANSWER

Your statistics course probably wants you to reason as follows:

The binomial distribution is well approximated by a Gaussian distribution with mean $0.73 \cdot 1506 = 1099.38$ and standard deviation $\sigma = \sqrt{1506 \cdot 0.73 \cdot 0.27} \approx 17.23$. For a Gaussian distribution, the 95% confidence interval usually quoted extends $1.96 \sigma \approx 33.77$ on either side of the mean, so the range is $1065.61$ to $1133.15$ people, which is $70.8\%$ to $75.2\%$ of the $1506$ interviewed.
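The arithmetic above can be checked with a few lines of Python. This is only a sketch of the normal approximation described in this answer; the variable names are illustrative, and the numbers $1506$, $0.73$, and $1.96$ are taken from the question and the text.

```python
import math

n = 1506        # number of people interviewed (from the question)
p_hat = 0.73    # observed fraction in favor (from the question)
z = 1.96        # multiplier for a two-sided 95% interval under a normal model

# Mean and standard deviation of the approximating Gaussian, in counts of people.
mean = n * p_hat
sigma = math.sqrt(n * p_hat * (1 - p_hat))

# Half-width of the interval, then the endpoints in counts and in percent.
half_width = z * sigma
low_count, high_count = mean - half_width, mean + half_width
low_pct, high_pct = 100 * low_count / n, 100 * high_count / n

print(f"mean = {mean:.2f}, sigma = {sigma:.2f}, half-width = {half_width:.2f}")
print(f"count interval:   {low_count:.2f} to {high_count:.2f}")
print(f"percent interval: {low_pct:.1f}% to {high_pct:.1f}%")
```

Dividing the count endpoints by $n$ is what converts the interval on the number of supporters into an interval on the proportion.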

There are significant subtleties that this glosses over, but this gives you the road to the answer.

0
On

First off, it depends heavily on how the people were chosen to be interviewed. For instance, if they were grabbed from a roster of various church congregation lists, then you cannot have any degree of confidence that they represent the public at large.

If you make the simplifying assumption that you took a simple random sample, then the number of people in your sample supporting the legalization of gay marriage is not binomially distributed, but rather hypergeometrically distributed. If the sample is very small compared to the population, then this is approximately binomial. And if the sample still is rather large, then yes, it is standard to invoke the Central Limit Theorem and say that the distribution is approximately normal. You can then make a confidence interval using the normal cumulative distribution function.

As you note, the missing ingredient here is the standard deviation of your sample average. If the proportion of the population that supports legalizing gay marriage is $p$, then the variance of each of your random variables (Bernoulli variables with parameter $p$) is $p(1-p)$. Assuming they are independent (not true unless the sample is small compared to the population), the variance of the distribution of your sample average is $p(1-p)/n$ and the standard deviation (i.e., standard error) is $\sqrt{p(1-p)/n}$.

Without knowledge of the population parameter $p$, you cannot determine this standard deviation for sure. The standard technique here is to "bootstrap", in the loose sense of a plug-in estimate rather than the resampling procedure of the same name. You assume that your sample percentage $\hat{p}$ is approximately right, and you use it in place of $p$. Typically, the error in your approximation doesn't cause that much error in this computation of the standard error.
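As a rough illustration of the plug-in step, the sketch below substitutes $\hat{p}$ for $p$ in $\sqrt{p(1-p)/n}$ and reads the multiplier off the inverse of the normal cumulative distribution function. The function name `plug_in_interval` and its parameters are made up for this example; the resulting interval is commonly known as the Wald interval, and it assumes a simple random sample that is small relative to the population.

```python
from math import sqrt
from statistics import NormalDist


def plug_in_interval(p_hat, n, confidence=0.95):
    """Normal-approximation interval for a proportion, using p_hat in place of p.

    Hypothetical helper for illustration; assumes independent draws and a
    sample large enough for the Central Limit Theorem to be reasonable.
    """
    # Quantile of the standard normal that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Plug-in standard error of the sample proportion.
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * standard_error, p_hat + z * standard_error


low, high = plug_in_interval(0.73, 1506)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

Because the standard error changes slowly with $p$ away from $0$ and $1$, a small error in $\hat{p}$ moves the endpoints only slightly, which is why the substitution is usually considered acceptable.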

If your sample is large relative to the population size, then you need to introduce a "correction factor" because your distribution is actually hypergeometric, and only approximately binomial. Here, you multiply the above standard error by $\sqrt{1-\frac{n-1}{N-1}}$, where $n$ is the sample size and $N$ is the population size.
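To get a feel for when the correction matters, here is a small sketch that applies the factor $\sqrt{1-\frac{n-1}{N-1}}$ to the plug-in standard error. The function name and the population sizes in the loop are hypothetical, chosen only to show the trend; the question does not state a population size.

```python
from math import sqrt


def corrected_standard_error(p_hat, n, N):
    """Plug-in standard error times the finite population correction factor.

    Hypothetical helper: n is the sample size, N the population size.
    """
    if not (1 <= n <= N and N > 1):
        raise ValueError("need 1 <= n <= N and N > 1")
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    correction = sqrt(1 - (n - 1) / (N - 1))
    return standard_error * correction


# Illustrative population sizes (assumptions, not from the question).
for N in (2_000, 20_000, 200_000_000):
    se = corrected_standard_error(0.73, 1506, N)
    print(f"N = {N:>11,}: standard error = {se:.5f}")
```

When $N$ is much larger than $n$ the factor is essentially $1$ and the binomial standard error is recovered; when $n = N$ the factor is $0$, reflecting that a full census has no sampling error.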

And if your sample size is very small, then you should worry about how good an approximation the Central Limit Theorem is giving.