Comparing Confidence Given by Concentration Inequalities and Central Limit Theorem


Given an i.i.d. sample $X_1,\dots, X_n$ from $\text{Ber}(p)$, Chebyshev's inequality gives $$\text{Pr}(|\overline{X}-p|\le \epsilon)\ge 1-\frac{p(1-p)}{n\epsilon^2}.$$ Hence, to guarantee $\text{Pr}(|\overline{X}-p|\le \epsilon)\ge 1-\alpha$, it suffices that $$n\ge \frac{p(1-p)}{\alpha\epsilon^2}.$$ Since $p(1-p)\le 1/4$, taking $$n\ge \frac{1}{4 \alpha\epsilon^2}$$ is enough regardless of the unknown $p$. For example, when $\alpha=0.05$ and $\epsilon = 0.01$, this requires $$n\ge 50{,}000.$$
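The worst-case Chebyshev sample size above can be computed directly; here is a minimal sketch (the function name `chebyshev_sample_size` is my own):

```python
import math

def chebyshev_sample_size(alpha: float, eps: float) -> int:
    """Smallest n such that Chebyshev's inequality, with the worst-case
    variance bound p(1-p) <= 1/4, guarantees
    Pr(|Xbar - p| <= eps) >= 1 - alpha."""
    return math.ceil(1 / (4 * alpha * eps**2))

# alpha = 0.05, eps = 0.01 reproduces the figure in the text:
print(chebyshev_sample_size(0.05, 0.01))  # → 50000
```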

On the other hand, by the Central Limit Theorem, $$\bar{X} \overset{d}{\approx} \mathcal{N}\left(p,\,\frac{p(1-p)}{n}\right),$$ so $$P\left(\bar{X} -z_{\alpha/2} \frac{\sqrt {p(1-p)}}{\sqrt n}<p< \bar{X} + z_{\alpha/2} \frac{\sqrt {p(1-p)}}{\sqrt n} \right) \approx 1 - \alpha.$$ Using $p(1-p)\le 1/4$ again, the number of samples required in this case is $$n \geq \frac{z_{\alpha/2}^{2}}{4 \epsilon^{2}}.$$ Thus when $\alpha=0.05$ and $\epsilon = 0.01$, only $n\ge 9{,}604$! So while the CLT gives a much smaller (near-optimal) sample size for estimating $p$, we lose exactness through the normal approximation. The answer from Chebyshev's inequality, though far from optimal, involves no approximation at all. Sharper concentration inequalities such as Chernoff/Hoeffding bounds improve on Chebyshev, but none of them matches the CLT figure. My question is whether there is a principled way to compare the loss from the normal approximation with the loss (conservativeness) from using a concentration inequality such as Chebyshev or Chernoff.
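To make the comparison concrete, here is a sketch computing the CLT-based sample size alongside the Hoeffding (Chernoff-type) bound $\text{Pr}(|\overline{X}-p|\ge\epsilon)\le 2e^{-2n\epsilon^2}$, which sits between Chebyshev and the CLT. The function names are my own; only the Python standard library is used:

```python
import math
from statistics import NormalDist

def clt_sample_size(alpha: float, eps: float) -> int:
    """n >= z_{alpha/2}^2 / (4 eps^2), using the worst case p(1-p) <= 1/4."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal quantile z_{alpha/2}
    return math.ceil(z**2 / (4 * eps**2))

def hoeffding_sample_size(alpha: float, eps: float) -> int:
    """Solve 2 exp(-2 n eps^2) <= alpha for n (Hoeffding's inequality)."""
    return math.ceil(math.log(2 / alpha) / (2 * eps**2))

alpha, eps = 0.05, 0.01
print(clt_sample_size(alpha, eps))        # → 9604, matching the text
print(hoeffding_sample_size(alpha, eps))  # → 18445
```

So for this example the ordering is CLT (9,604) < Hoeffding (18,445) < Chebyshev (50,000): the exponential tail bound recovers a good part of the gap without any distributional approximation, but still not all of it.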