95% Confidence Interval Confusion


I chose an arbitrary bound B to determine how large my sample should be. (There is a formula into which you plug B to get the required sample size n.) I then calculated the estimated population mean and its variance from a sample of size n. If I take the estimated population mean ± 2*sqrt(var), is that a 95% confidence interval? Or not, because I set a bound to determine my sample size? Where does the bound I set come into play? I'm a little confused.


Maybe some basic facts about confidence intervals based on normal samples will help you clarify how to proceed.

Suppose you have a random sample $X_i$ of size $n$ from a normal population with mean $\mu$ and standard deviation $\sigma.$ Also suppose that $\mu$ is unknown and is to be estimated by the sample mean $\bar X,$ and that $\sigma$ is known. Then the usual 95% CI for $\mu$ is of the form $\bar X \pm 1.96 \sigma/\sqrt{n}.$

The usual terminology is that the 'standard error' is $\sigma/\sqrt{n}.$ That is, $Var(\bar X) = \sigma^2/n$ and $SD(\bar X) = \sigma/\sqrt{n}.$ Also, the 'margin of error' is $1.96\sigma/\sqrt{n},$ which is half the length of the CI.
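As a small illustration of these quantities, here is a minimal sketch in Python using only the standard library. The function name `z_interval` and the data values are made up for the example, and $\sigma$ is assumed known, as in the setup above.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_interval(sample, sigma, level=0.95):
    """Return (lower, upper, margin) for a z-interval with known sigma."""
    n = len(sample)
    # Two-sided critical value; about 1.96 when level = 0.95
    z = NormalDist().inv_cdf(0.5 + level / 2)
    standard_error = sigma / sqrt(n)      # SD of the sample mean
    margin = z * standard_error           # half the length of the CI
    xbar = mean(sample)
    return xbar - margin, xbar + margin, margin

# Hypothetical data, invented for illustration only
data = [9.8, 10.4, 10.1, 9.6, 10.3, 10.0, 9.9, 10.5, 9.7]
lower, upper, margin = z_interval(data, sigma=0.3)
print(f"95% CI: ({lower:.3f}, {upper:.3f}), margin of error {margin:.3f}")
```

With $n = 9$ and $\sigma = 0.3$ the standard error is $0.1$, so the margin of error comes out close to $0.196.$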

If you want to balance the sample size $n$ against the margin of error $M$, the key equation is $M = 1.96\sigma/\sqrt{n},$ which can be solved to express $n$ in terms of a desired $M$ and the constant $\sigma$: $n = (1.96\sigma/M)^2,$ rounded up to the next integer.
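Solving the key equation for $n$ can be sketched in a few lines of Python. The function name `sample_size_known_sigma` and the numbers $\sigma = 10,$ $M = 2$ are assumptions chosen only to illustrate the arithmetic.

```python
from math import ceil
from statistics import NormalDist

def sample_size_known_sigma(margin, sigma, level=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin, for known sigma."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    # Solve margin = z * sigma / sqrt(n) for n, then round up
    return ceil((z * sigma / margin) ** 2)

# Illustrative values: sigma = 10, desired margin of error M = 2
n_needed = sample_size_known_sigma(margin=2, sigma=10)
print(n_needed)
```

Because $n$ grows like $1/M^2,$ halving the desired margin of error roughly quadruples the required sample size.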

If the population standard deviation $\sigma$ is unknown and estimated by the sample standard deviation $S,$ then the formula for the CI is $\bar X \pm t^*S/\sqrt{n}.$ This makes it a little more complicated to find $n$ in terms of $M$ and $\sigma,$ because the 'probability factor' $t^*$ is chosen to cut 2.5% of the area from the upper tail of the (symmetrical) Student t distribution with $n-1$ degrees of freedom.

One complication is that $t^*$ also depends on $n.$ But you can start with $t^* \approx 2$ and iterate, if $n$ is small enough to give $t^*$ much larger than 2. Another complication is that $S$ is a random variable and will typically be different from one sample to the next. Thus, even if the exact value of $\sigma$ is unknown, it helps to have some idea of its size.
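The iteration described above might look like the following sketch. It assumes you have a rough planning value for $\sigma$ (called `sigma_guess` here, a hypothetical name) to stand in for the not-yet-observed $S.$ To keep the example free of third-party packages, the t quantile is computed by numerically integrating the t density and bisecting; a statistics library's quantile function would normally replace `t_quantile`.

```python
from math import ceil, exp, lgamma, log, pi

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, x], plus 0.5 by symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Quantile for 0.5 < p < 1 by bisection (a simple numerical stand-in)."""
    lo, hi = 0.0, 200.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sample_size_t(margin, sigma_guess, level=0.95, max_iter=50):
    """Iterate n = (t* sigma / M)^2, updating t* from n - 1 degrees of freedom."""
    t_star = 2.0                                  # starting value t* ~ 2
    n = max(2, ceil((t_star * sigma_guess / margin) ** 2))
    for _ in range(max_iter):
        t_star = t_quantile(0.5 + level / 2, n - 1)
        n_new = max(2, ceil((t_star * sigma_guess / margin) ** 2))
        if n_new == n:                            # stop once n is stable
            break
        n = n_new
    return n

# Illustrative values: planning guess sigma = 10, desired margin M = 2
print(sample_size_t(margin=2, sigma_guess=10))
```

The result is only a planning figure: once the data are in hand, the realized margin of error $t^*S/\sqrt{n}$ depends on the observed $S$ and may be larger or smaller than the target $M.$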