Suppose I have a sample of $n$ people, each of whom smokes with probability $p$. I am asked what $n$ should be so that the proportion of smokers in the sample is within 0.01 of $p$ with probability at least 0.95.
I saw one answer and didn't understand one of its steps:
Let $S_n$ represent the number of smokers. I am looking for $n$ that satisfies $P(|{S_n\over n}-p|\le 0.01)\ge 0.95$. This is equivalent to $P(-{0.01\sqrt{n}\over \sqrt{p(1-p)}} \le {S_n-np\over \sqrt{np(1-p)}}\le {0.01\sqrt{n}\over \sqrt{p(1-p)}})\ge 0.95$. But then it is said that $n$ should satisfy ${0.01\sqrt{n}\over \sqrt{p(1-p)}}\ge 1.96$. Why? If so, then by the normal distribution table, I get that the probability isn't 0.95. What am I missing? I would really appreciate any sort of help.
You should get from the normal distribution table that $\int_{-\infty}^{1.96} \frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}dx = \Phi(1.96)\approx 0.975$. This means that:
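$P[-x \leq \frac{S_n - np}{\sqrt{np(1-p)}} \leq x] \approx \Phi(x) - \Phi(-x) = 2\Phi(x) - 1$, and $2\Phi(x) - 1 = 0.95 \iff \Phi(x) = 0.975 \iff x \approx 1.96$
Here the standardized count $\frac{S_n - np}{\sqrt{np(1-p)}}$ is approximately standard normal for large $n$. Since $2\Phi(x) - 1$ increases with $x$, taking $x = \frac{0.01\sqrt{n}}{\sqrt{p(1-p)}}$ gives a probability of at least 0.95 exactly when $x \ge 1.96$, which is the condition you quoted.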
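If it helps to see numbers, here is a minimal sketch that checks the table value and solves the condition $\frac{0.01\sqrt{n}}{\sqrt{p(1-p)}} \ge 1.96$ for $n$. The function name `required_sample_size` and its parameters `margin` and `confidence` are my own choices for illustration, not part of the original problem, and the result relies on the same normal approximation as above. Since $p$ is not given in the question, the usage line assumes the worst case $p = 0.5$, which maximizes $p(1-p)$.

```python
import math
from statistics import NormalDist  # standard library, Python 3.8+


def required_sample_size(p, margin=0.01, confidence=0.95):
    """Smallest n with margin*sqrt(n)/sqrt(p*(1-p)) >= z (normal approximation).

    The name and parameters are illustrative assumptions, not from the question.
    """
    # Two tails: the central area is `confidence`, so Phi(z) = (1 + confidence) / 2.
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    # Solve margin * sqrt(n) >= z * sqrt(p * (1 - p)) for n and round up.
    return math.ceil((z / margin) ** 2 * p * (1 - p))


# Table check: Phi(1.96) is about 0.975, so the central area is about 0.95.
phi = NormalDist().cdf(1.96)
print(round(phi, 4), round(2 * phi - 1, 4))

# Worst case p = 0.5 (an assumption, since p is unknown in the question).
print(required_sample_size(0.5))  # 9604
```

Reading 0.95 straight off the table corresponds to a one-sided cutoff $\Phi(x) = 0.95$, which is where the mismatch in the question likely comes from; the code uses $(1 + 0.95)/2 = 0.975$ instead.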
Summary: don't forget there are two tails. Also: it's easier to draw it than follow my ugly notation.