So I have solutions to two statistics questions below, but I don't quite understand where some values came from and was hoping someone could clarify. I bolded the steps I didn't understand and also left a comment at the start of each solution.
Q1:
An estimate of the percentage of the defectives in a lot of pins supplied by a vendor is desired to be within 1% of the true proportion at 90% confidence level.
(b) If the actual percentage of the defectives is unknown, what is the minimum sample size needed for the study?
The solution is below, but I don't understand why p = 0.5. What makes that the worst case?
So the worst case for the value of sqrt(p*(1-p)/n) is when p = 0.5
CI_Low = p - Z_critical * sqrt(p*(1-p)/n)
CI_High = p + Z_critical * sqrt(p*(1-p)/n)
1.645*sqrt ( p*(1-p)/ n) = 0.01
sqrt( 0.5*0.5/n) = 0.01/1.645 = 0.006079027
0.25/n = 3.69546E-05
3.69546E-05*n = 0.25
n = 0.25/3.69546E-05 = 6765.0625
n = 6766
Q2:
A statistician estimates the 92% confidence interval for the mean of a normally distributed population as (162.75, 173.25) at the end of a sampling experiment assuming a known population standard deviation.
a. Use the information given to construct the 97% confidence interval for the population mean.
The solution for this is long so I'm not going to paste all of it, but I was wondering why the tails '4%' and '1.5%' need to be added to the confidence level when looking up the critical z values. I tried searching online but I couldn't figure out what formula or rule this falls under.
CI_Low = mean - Z_critical*standard deviation/sqrt(N)
CI_High = mean + Z_critical*standard deviation/sqrt(N)
mean = (162.75 +173.25) /2 = 168
A 92% confidence range has a 4% tail on each side
P(z < Z_critical) = 0.92 + 0.04 = 0.96
Z_critical = 1.750686071
P(z< 1.75 ) = 0.9599
P(z< 1.76) = 0.9608
So for the 97% confidence range:
97% has a 1.5% tail on each side
P(z < Z_critical) = 0.97 + 0.015 = 0.985 gives Z_critical for that
P(z<2.17) = 0.9850
Thoughtful question. I will try to answer just the parts you put in bold type.
(1) Consider the plot of $p(1-p)$ against $p$ for $0 < p < 1.$
Notice that the maximum of this parabolic curve is at $p = 1/2.$ Consequently, the standard error $\sqrt{p(1-p)/n}$ is maximized when $p = 1/2.$
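If you prefer algebra to a picture, the same fact can be checked by completing the square (this is just a supplementary derivation, not part of the original solution):

$$p(1-p) = \frac{1}{4} - \left(p - \frac{1}{2}\right)^2 \le \frac{1}{4},$$

with equality exactly when $p = 1/2.$ Equivalently, $\frac{d}{dp}\,[p(1-p)] = 1 - 2p,$ which is zero only at $p = 1/2.$ So $\sqrt{p(1-p)/n} \le \sqrt{0.25/n}$ for every $p$ in $(0, 1).$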
Because you don't know $p,$ you need to use the largest possible standard error in order to be sure the sample size $n$ is large enough.
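If it helps to check the arithmetic, here is a minimal Python sketch of the sample-size calculation, using only the standard library. The function name `min_sample_size` and its parameters are my own choices for illustration. It solves $z\sqrt{p(1-p)/n} \le E$ for $n$ and rounds up.

```python
import math
from statistics import NormalDist

def min_sample_size(margin, z, p=0.5):
    """Smallest integer n with z * sqrt(p * (1 - p) / n) <= margin.

    p defaults to 0.5, the worst case when the true proportion is unknown.
    (Hypothetical helper, written only for this illustration.)
    """
    return math.ceil(z**2 * p * (1 - p) / margin**2)

# Table value used in the posted solution: z = 1.645 for 90% confidence.
print(min_sample_size(0.01, 1.645))    # 6766, matching the solution

# Unrounded critical value: P(Z < z) = 0.95 leaves 5% in each tail.
z_exact = NormalDist().inv_cdf(0.95)
print(min_sample_size(0.01, z_exact))  # 6764
```

The small difference between the two results comes only from rounding $z$ to 1.645. If you had a trustworthy prior guess for $p$ away from 1/2, passing it as `p` would give a smaller $n,$ which is exactly why $p = 1/2$ is the safe choice when nothing is known.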
(2) A 92% confidence interval is of the form $\bar X \pm 1.7507\frac{\sigma}{\sqrt{n}}.$ The bold statement has to do with how the numbers $\pm 1.7507$ are found. In order to have 92% of the probability under the standard normal curve in the central part, you need to put 4% of the probability in each tail, that is, outside the cutoff points at each end of the curve. Normal tables give cumulative probabilities $P(Z < z),$ which include the whole lower tail, so the upper cutoff is the value with $P(Z < z) = 0.92 + 0.04 = 0.96,$ namely $z = 1.7507.$ [Similarly, for a 95% CI, you put probability 2.5% in each tail; for a 90% CI you put 5% in each tail; and for a 97% CI you put 1.5% in each tail.]
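Here is a minimal Python sketch of that tail rule, and of one way to carry the 92% interval over to 97%. The names `z_critical` and `rescale_ci` are hypothetical helpers I made up for this illustration. It assumes, as the problem does, a known $\sigma,$ so both intervals share the same standard error $\sigma/\sqrt{n}.$

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value: put (1 - confidence)/2 in each tail."""
    tail = (1 - confidence) / 2
    # The cumulative probability includes the whole lower tail.
    return NormalDist().inv_cdf(1 - tail)

def rescale_ci(low, high, old_conf, new_conf):
    """Convert a z-interval for a mean from one confidence level to another."""
    mean = (low + high) / 2
    half_width = (high - low) / 2
    std_err = half_width / z_critical(old_conf)   # sigma / sqrt(n)
    new_half = z_critical(new_conf) * std_err
    return mean - new_half, mean + new_half

for level in (0.90, 0.92, 0.95, 0.97):
    print(f"{level:.0%}: z = {z_critical(level):.4f}")

low97, high97 = rescale_ci(162.75, 173.25, 0.92, 0.97)
print(f"97% CI: ({low97:.2f}, {high97:.2f})")
```

With the numbers in the question this gives roughly $(161.49,\, 174.51)$: the center stays at 168, and the half-width grows from 5.25 by the factor $2.1701/1.7507.$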