I have a question I need help on:
A supermarket manager samples n = 50 customers. If the true fraction of customers who dislike the policy is approximately 0.9, find the probability that the sample fraction will be within 0.15 units of the true fraction.
From this information, we know for the binomial distribution that n = 50 and p = 0.9. If I am correct, the sample proportion (we'll call it p') can be approximated as $N(p,\sqrt{pq/n})$, where $q = 1 - p$. Therefore, what we are trying to find is the probability $P(|p' - p| \le 0.15) = P(-0.15 \le p' - p \le 0.15)$.
At this point, I'm kind of stuck. I know that I need to get Z where p' - p is, but I'm not sure what to do. If I understand correctly, the sample proportion can be represented as $\dfrac{Y}{n}$, leaving $P(-0.15 \le \dfrac{Y}{n} - p \le 0.15)$, but I'm not sure I understand why. Would someone be able to help me work through this? I really appreciate your help in advance! Thanks!
The idea of the normal approximation to the binomial is to use a normal distribution whose mean and variance equal those of the binomial distribution; i.e., if the sample count is a random variable $X \sim \operatorname{Binomial}(n,p)$, then $$X \overset{\circ}{\sim} \operatorname{Normal}(\mu = np, \sigma^2 = np(1-p)).$$ Equivalently, we may express this approximation in terms of the sampling distribution of the sample proportion, i.e., $$\hat p = \frac{X}{n} \overset{\circ}{\sim} \operatorname{Normal}\left(\mu = p, \sigma^2 = \frac{p(1-p)}{n} \right).$$ This approximation then allows us to compute relevant probabilities using normal distribution tables. In your case, $n = 50$, $p = 0.9$, $\sigma = \sqrt{(0.9)(0.1)/50} \approx 0.0424264$, and the desired probability is the probability that the sample proportion $\hat p$ is within $0.15$ units of the true proportion $p$: $$\Pr[|\hat p - p| < 0.15] = \Pr[-0.15 < \hat p - 0.9 < 0.15].$$ If we now standardize by dividing through by $\sigma$, writing the above probability as $$\Pr\left[ \frac{-0.15}{0.0424264} < \frac{\hat p - \mu}{\sigma} < \frac{0.15}{0.0424264} \right],$$ we see that the random variable in the center of this inequality is approximately a standard normal variable with mean $0$ and standard deviation $1$. Thus the above probability is simply $$\Pr[-3.53553 < Z < 3.53553] = 1 - 2 \Pr[Z > 3.53553] \approx 0.999593.$$
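As a quick check on the standardization above, here is a minimal Python sketch of the same calculation. It assumes Python 3.8+ for `statistics.NormalDist`; the variable names (`delta`, `sigma`, `z`, `prob`) are my own labels, not anything from the problem statement. It should give values close to the $\sigma$, $z$, and probability quoted above.

```python
from math import sqrt
from statistics import NormalDist

n, p = 50, 0.9   # sample size and true proportion from the question
delta = 0.15     # allowed distance between sample and true proportion

# Standard deviation of the sample proportion under the normal approximation.
sigma = sqrt(p * (1 - p) / n)

# Standardized half-width of the interval: (p_hat - p) / sigma lies in (-z, z).
z = delta / sigma

# Pr[-z < Z < z] for a standard normal Z.
std_normal = NormalDist()  # mean 0, standard deviation 1
prob = std_normal.cdf(z) - std_normal.cdf(-z)

print(f"sigma = {sigma:.7f}")
print(f"z     = {z:.5f}")
print(f"P(|p_hat - p| < {delta}) is approximately {prob:.6f}")
```

Using the CDF difference rather than $1 - 2\Pr[Z > z]$ is only a matter of convenience; by symmetry of the normal distribution the two are the same quantity.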
However, the exact probability corresponds to the quantity $$\begin{align*} \Pr[0.9 - 0.15 < X/n < 0.9 + 0.15] &= \Pr[0.75(50) < X] \\ &= \Pr[X \ge 38] \\ &= \sum_{x=38}^{50} \binom{50}{x} (0.9)^x (1 - 0.9)^{50-x} \\ &\approx 0.998995. \end{align*}$$ This sum, which contains 13 terms, is difficult to calculate except with a computer or a very sophisticated calculator.
Note that the exact probability is smaller than the normal approximation value. This is because the sample proportion cannot actually take on continuous values: it is discrete, since the number of dissatisfied customers in a sample of size $n = 50$ is necessarily an integer between $0$ and $50$. For the sample proportion to be within $0.15$ of the true proportion $p = 0.9$, we need $X > 37.5$; but $X$ cannot actually be, say, $37.7$, hence the rounding up to $X \ge 38$. The upper bound is ignored because $0.9 + 0.15 = 1.05 > 1$, so in effect the upper bound is $X \le 50$. So on both ends of the interval, the normal approximation includes a tiny bit of "extra" probability that is not counted using the exact binomial distribution.
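Since the exact sum is tedious by hand, here is a small Python sketch that evaluates it directly. It assumes Python 3.8+ for `math.comb`; the helper `binom_pmf` and the list `counts` are hypothetical names I introduced for illustration. Rather than hard-coding the limits, it collects every integer count whose sample proportion falls within $0.15$ of $p$, which should recover the range $38$ to $50$ used above.

```python
from math import comb

n, p = 50, 0.9   # sample size and true proportion from the question
delta = 0.15     # allowed distance between sample and true proportion


def binom_pmf(x, n, p):
    """Probability of exactly x successes in n Binomial(n, p) trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Integer counts x whose sample proportion x/n is within delta of p.
counts = [x for x in range(n + 1) if abs(x / n - p) < delta]

# Exact probability: sum the binomial pmf over those counts.
exact = sum(binom_pmf(x, n, p) for x in counts)

print(f"counts run from {counts[0]} to {counts[-1]} ({len(counts)} terms)")
print(f"exact probability is approximately {exact:.6f}")
```

Comparing this value with the normal approximation shows how far the approximation is from the exact answer for this particular $n$ and $p$.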