Why can we replace $p$ with its estimate $\hat{p}$ and not lose the normality of the distribution?


This is from 'Introduction to Mathematical Statistics' by Hogg et al. (8th ed.):

Example 4.2.3 (Large Sample Confidence Interval for $p$). Let $X$ be a Bernoulli random variable with probability of success $p$, where $X$ is $1$ or $0$ if the outcome is success or failure, respectively. Suppose $X_1, \ldots, X_n$ is a random sample from the distribution of $X$. Let $\hat{p} = \bar{X}$ be the sample proportion of successes. Note that $\hat{p} = \frac{1}{n} \sum_{i=1}^n X_i$ is a sample average, and that $\text{Var}(\hat{p}) = \frac{p(1 - p)}{n}$. It follows immediately from the CLT that the distribution of $Z = \frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}}$ is approximately $N(0, 1)$. Referring to Example 5.1.1 of Chapter 5, we replace $p(1-p)$ with its estimate $\hat{p}(1 - \hat{p})$. Then proceeding as in the last example, an approximate $(1 - \alpha)100\%$ confidence interval for $p$ is given by $\left(\hat{p} - z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}},\ \hat{p} + z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right)$.
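The interval in the example can be checked empirically. Below is a minimal simulation sketch (my own, not from the book; the true $p$, sample size, and trial count are assumptions for the demo) that builds the interval many times and measures how often it captures the true $p$:

```python
import math
import random

# Hypothetical coverage check for the large-sample (Wald) interval for p.
# All parameters here are illustrative assumptions, not from the text.
random.seed(0)
p_true = 0.3       # assumed true success probability
n = 500            # sample size per experiment
z = 1.959964       # z_{alpha/2} for alpha = 0.05
num_trials = 2000

covered = 0
for _ in range(num_trials):
    successes = sum(1 for _ in range(n) if random.random() < p_true)
    p_hat = successes / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    if p_hat - half_width <= p_true <= p_hat + half_width:
        covered += 1

coverage = covered / num_trials
print(f"empirical coverage: {coverage:.3f}")
```

For moderate $n$ the empirical coverage should sit near the nominal 95%, which is exactly the sense in which the interval is "approximate".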

I'm confused by the part 'Referring to Example 5.1.1 of Chapter 5, we replace $p(1-p)$ with its estimate $\hat{p}(1 - \hat{p})$.' Example 5.1.1, which is quoted below, seems irrelevant. Why can we replace $p$ with its estimate and not lose the normality of the distribution?

Example 5.1.1 (Sample Variance). Let $X_1, \ldots , X_n$ denote a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. In Example 2.8.7, we showed that the sample variance is an unbiased estimator of $\sigma^2$. We now show that it is a consistent estimator of $\sigma^2$. Recall Theorem 5.1.1, which shows that $\overline{X}_n \xrightarrow{P} \mu$. To show that the sample variance converges in probability to $\sigma^2$, assume further that $\mathbb{E}[X_1^4] < \infty$, so that $\text{Var}(S^2) < \infty$. Using the preceding results, we can show the following:

$$S^2_n = \frac{1}{n - 1} \sum_{i=1}^n (X_i - \overline{X}_n)^2 = \frac{n}{n - 1} \left( \frac{1}{n} \sum_{i=1}^n X_i^2 - \overline{X}_n^2 \right) \xrightarrow{P} 1 \cdot \left[ \mathbb{E}(X_1^2) - \mu^2 \right] = \sigma^2.$$ Hence the sample variance is a consistent estimator of $\sigma^2$.

From the discussion above, we have immediately that $S_n \xrightarrow{P} \sigma$; that is, the sample standard deviation is a consistent estimator of the population standard deviation.
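The consistency claim in Example 5.1.1 is easy to see numerically. Here is a small sketch (my own illustration; the normal population and its variance are assumed for the demo) showing $S_n^2$ concentrating around $\sigma^2$ as $n$ grows:

```python
import random
import statistics

# Illustrative (not from the book): sample variance S_n^2 of draws from a
# N(0, 2^2) population should converge in probability to sigma^2 = 4.
random.seed(1)
sigma2 = 4.0  # assumed true variance for this demo

for n in (10, 100, 10000):
    sample = [random.gauss(0, 2) for _ in range(n)]
    s2 = statistics.variance(sample)  # uses the 1/(n-1) divisor
    print(n, round(s2, 3))
```

As $n$ increases the printed values settle near 4, which is the convergence in probability the example proves.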

Answer:

He's alluding to the plug-in principle, which says, roughly, that if a statistic $T$ is a consistent estimator of some property $\tau$ of the underlying distribution, then you can substitute $T$ for $\tau$ in calculations about that distribution.

The sample size at which this estimate is "good enough" depends on the underlying distribution and required tolerances.

In this case, the justification is that $\hat p(1-\hat p) \xrightarrow{P} p(1-p)$, by the consistency argument of Example 5.1.1 (which is why the book cites it). The missing step is Slutsky's theorem: since $\sqrt{p(1-p)/(\hat p(1-\hat p))} \xrightarrow{P} 1$, multiplying $Z$ by this ratio does not change the limiting distribution, so $\frac{\hat p - p}{\sqrt{\hat p(1-\hat p)/n}} \xrightarrow{D} N(0,1)$ as well. Normality is not "lost" because it was only ever an approximation for large $n$, and the plug-in denominator converges to the true one fast enough to preserve the limit.
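The claim that plugging in $\hat p$ preserves approximate normality can be checked directly. This sketch (my own, with assumed parameters) simulates the studentized statistic with $\hat p(1-\hat p)$ in the denominator and checks that roughly 95% of draws fall in $(-1.96, 1.96)$, as they would for a standard normal:

```python
import math
import random

# Illustrative check (assumed parameters, not from the book): the statistic
# (p_hat - p) / sqrt(p_hat (1 - p_hat) / n) should behave like N(0, 1),
# so about 95% of draws should land within +/- 1.96.
random.seed(2)
p, n, trials = 0.4, 400, 5000

inside = 0
for _ in range(trials):
    p_hat = sum(1 for _ in range(n) if random.random() < p) / n
    z = (p_hat - p) / math.sqrt(p_hat * (1 - p_hat) / n)
    inside += abs(z) < 1.96

print(f"fraction within ±1.96: {inside / trials:.3f}")
```

With $p = 0.4$ and $n = 400$ the chance of $\hat p$ hitting exactly 0 or 1 (which would zero the denominator) is negligible; for $p$ near the boundary or small $n$ the normal approximation itself degrades, matching the answer's caveat about required sample size.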