Deriving confidence interval for Bernoulli proportion

I want to derive from scratch a $(1 - \alpha)$ confidence interval (CI) for a Bernoulli proportion. My result differs from the well-known one, and I'm not sure why.

My derivation

We are looking for $l, r$ such that $P(\theta \in [l, r]) = 1 - \alpha$. By the central limit theorem, we know that

$$\frac{\sqrt n (\overline X_n - \theta)}{\sqrt{\theta(1 - \theta)}} \rightarrow N(0, 1)$$

Now:

$$P\left(l \le \frac{\sqrt n (\overline X_n - \theta)}{\sqrt{\theta(1 - \theta)}} \le r\right) = P\left(\frac{\sqrt n (\overline X_n - \theta)}{\sqrt{\theta(1 - \theta)}}\le r\right) - P \left(\frac{\sqrt n (\overline X_n - \theta)}{\sqrt{\theta(1 - \theta)}} \le l\right)$$

I set these two probabilities equal to, say, $\alpha_1$ and $\alpha_2$, such that $\alpha_1 + \alpha_2 = 1-\alpha$:

$$P\left(\frac{\sqrt n (\overline X_n - \theta)}{\sqrt{\theta(1 - \theta)}}\le r\right) = \alpha_1$$

$$ P \left(\frac{\sqrt n (\overline X_n - \theta)}{\sqrt{\theta(1 - \theta)}} \le l\right) = \alpha_2$$

From those two equalities we obtain $r = z_{\alpha_1}, \; l = z_{\alpha_2}$, where $z_y$ is the quantile function of the standard normal distribution evaluated at $y$. Using these $l$ and $r$, I finally obtain the CI:

$$\left[\overline X_n - z_{\alpha_2}\sqrt{\frac{\theta(1 - \theta)}{n}}, \overline X_n - z_{\alpha_1}\sqrt{\frac{\theta(1 - \theta)}{n}}\right]$$

Whereas the standard result that should be obtained is:

$$\left[\overline X_n - z_{1 - \frac \alpha 2}\sqrt{\frac{\theta(1 - \theta)}{n}}, \overline X_n + z_{1 - \frac \alpha 2}\sqrt{\frac{\theta(1 - \theta)}{n}}\right]$$

Could you please explain where this difference comes from?

There are 1 best solutions below

First, you have a small algebra error where you say $\alpha_1 + \alpha_2 = 1 - \alpha$. If you use the notation you have for those probabilities, it should be $\alpha_1 - \alpha_2 = 1 - \alpha$.
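A quick numerical check of this point, as a minimal sketch using Python's standard-library `statistics.NormalDist`. The values of `alpha`, `alpha_1`, and `alpha_2` are illustrative choices, not taken from the question.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

alpha = 0.05
alpha_1 = 0.975   # P(Z <= r), illustrative choice
alpha_2 = 0.025   # P(Z <= l), illustrative choice

r = Z.inv_cdf(alpha_1)   # r = z_{alpha_1}
l = Z.inv_cdf(alpha_2)   # l = z_{alpha_2}

# P(l <= Z <= r) is a difference of two cdf values, so it equals
# alpha_1 - alpha_2, not alpha_1 + alpha_2.
coverage = Z.cdf(r) - Z.cdf(l)

print(f"coverage          = {coverage:.6f}")
print(f"alpha_1 - alpha_2 = {alpha_1 - alpha_2:.6f}")
print(f"alpha_1 + alpha_2 = {alpha_1 + alpha_2:.6f}")
```

With these values the difference matches $1 - \alpha$, while the sum does not.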

Second, your confidence interval may be easier to interpret as $$ \left[ \overline{X} + z_{\alpha_2} \sqrt{\frac{\theta(1-\theta)}{n}}, \overline{X} + z_{\alpha_1} \sqrt{\frac{\theta(1-\theta)}{n}} \right].$$ Here the minus sign comes from the choice of $\alpha_2$ (since $z_{\alpha_2} < 0$ whenever $\alpha_2 < \tfrac{1}{2}$) rather than being written into the endpoints. We would now like to show that $z_{\alpha_2} = -z_{1 - \frac{\alpha}{2}} = z_{\frac{\alpha}{2}}$ and that $z_{\alpha_1} = z_{1 - \frac{\alpha}{2}}.$

Third, you have made the key observation that there are infinitely many confidence sets for any given $\alpha$, even with the heavy parametric restriction on the form of the confidence sets. If you give me $\alpha_1$ (with some constraints), I can give you $\alpha_2$ to match. So how do we choose one? It turns out that if we minimize the length of the interval under this parametric restriction, we get a unique interval, which is nice since we can calculate it every time. That is, we can show the symmetry you want given the normality constraint.
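To make the "give me $\alpha_1$, I give you $\alpha_2$" statement concrete, here is a small sketch in Python. The helper name `matching_alpha_2` and the listed values of $\alpha_1$ are hypothetical, chosen only for illustration; the bounds $l, r$ are on the standardized scale.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution
alpha = 0.05      # illustrative level

def matching_alpha_2(alpha_1, alpha=alpha):
    """Return alpha_2 = P(Z <= l) such that alpha_1 - alpha_2 = 1 - alpha."""
    alpha_2 = alpha_1 - (1 - alpha)
    if not 0 < alpha_2 < alpha_1 < 1:
        raise ValueError("alpha_1 must lie strictly between 1 - alpha and 1")
    return alpha_2

rows = []
for alpha_1 in (0.955, 0.965, 0.975, 0.985, 0.995):
    alpha_2 = matching_alpha_2(alpha_1)
    l, r = Z.inv_cdf(alpha_2), Z.inv_cdf(alpha_1)
    rows.append((alpha_1, alpha_2, l, r, r - l))
    print(f"alpha_1={alpha_1:.3f}  alpha_2={alpha_2:.3f}  "
          f"l={l:+.3f}  r={r:+.3f}  length={r - l:.3f}")
```

Every row has the same coverage $1 - \alpha$, but the lengths $r - l$ differ, which is why an extra criterion is needed to single out one interval.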

Note: there is a paper by Pratt, "Length of confidence intervals", which proves a similar but even stronger result. In particular, he shows that any uniformly most powerful test can be inverted to get the confidence interval of smallest length. So finding this confidence interval of smallest length is a good idea if there is a uniformly most powerful test (and in the case of this parametric restriction, there is!).

Here is the approach. We want the interval $C(a,b) = [\overline{X} - a, \overline{X} + b]$ such that $P(C(a,b) \ni \theta) = 1 - \alpha$. Let $\phi$ be the standard normal pdf and $\Phi$ be the standard normal cdf. Using the normality condition you laid out, we get (approximately, for large $n$)
$$\begin{align*}
P(C(a,b) \ni \theta) &= P(\overline{X} - a \leq \theta \leq \overline{X} + b)\\
&= P(-b \leq \overline{X} - \theta \leq a) \\
&= P\left(-\frac{b\sqrt{n}}{\sqrt{\theta(1-\theta)}} \leq \frac{\sqrt{n}(\overline{X} - \theta)}{\sqrt{\theta(1-\theta)}} \leq \frac{a\sqrt{n}}{\sqrt{\theta(1-\theta)}} \right) \\
&= P\left(\frac{\sqrt{n}(\overline{X} - \theta)}{\sqrt{\theta(1-\theta)}} \leq \frac{a\sqrt{n}}{\sqrt{\theta(1-\theta)}} \right) - P\left(\frac{\sqrt{n}(\overline{X} - \theta)}{\sqrt{\theta(1-\theta)}}\leq -\frac{b\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) \\
&=\Phi\left(\frac{a\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) - \Phi\left(-\frac{b\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) \\
&=1 - \alpha
\end{align*}$$

Thus we get the set $$\text{ValidEndpoints} = \{(a,b) \, : \, P(C(a,b) \ni \theta) = 1 - \alpha\} = \left\{ (a,b) \, : \, \Phi\left(\frac{a\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) - \Phi\left(-\frac{b\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) =1 - \alpha \right\}.$$
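The membership condition for $\text{ValidEndpoints}$ can be evaluated directly. The sketch below is only an illustration: the function name `coverage` and the values of `n`, `theta`, and the tail masses 0.01 and 0.04 are assumptions made for the example, and $\theta$ is treated as known, as in the formulas above.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def coverage(a, b, n, theta):
    """Normal-approximation value of P(Xbar - a <= theta <= Xbar + b)."""
    scale = sqrt(n) / sqrt(theta * (1 - theta))
    return Z.cdf(a * scale) - Z.cdf(-b * scale)

n, theta, alpha = 100, 0.3, 0.05          # illustrative values
se = sqrt(theta * (1 - theta) / n)        # standard error of Xbar

# Symmetric pair: tail mass alpha/2 on each side.
a_sym = b_sym = Z.inv_cdf(1 - alpha / 2) * se

# Asymmetric pair: tail mass 0.01 above and 0.04 below (still sums to alpha).
a_asym = Z.inv_cdf(1 - 0.01) * se
b_asym = -Z.inv_cdf(0.04) * se

for name, a, b in (("symmetric", a_sym, b_sym), ("asymmetric", a_asym, b_asym)):
    print(f"{name:>10}: a={a:.4f}  b={b:.4f}  coverage={coverage(a, b, n, theta):.6f}")
```

Both pairs satisfy the constraint, so both belong to $\text{ValidEndpoints}$.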

This set contains the infinitely many possible confidence intervals. Now we will choose just one of them by minimizing the length of the confidence interval.

This is our constraint set, and we want to minimize the length of $C(a,b)$, which is $a + b$. We therefore use the method of Lagrange multipliers for $$ \min a + b \quad \text{s.t.} \quad (a,b) \in \text{ValidEndpoints}. $$ We get
$$ \begin{align*}
&L = a + b + \lambda\left(1 - \alpha - \Phi\left(\frac{a\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) + \Phi\left(-\frac{b\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right)\right) \\
\implies & \frac{\partial L}{\partial a} = 1 - \frac{\lambda \sqrt{n}}{\sqrt{\theta(1-\theta)}}\phi\left(\frac{a\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) = 0 \\
& \frac{\partial L}{\partial b} = 1 - \frac{\lambda \sqrt{n}}{\sqrt{\theta(1-\theta)}}\phi\left(-\frac{b\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) = 0 \\
& \frac{\partial L}{\partial \lambda} = 0
\end{align*}$$
Eliminating $\lambda$ from the first two equations gives $$ \phi\left(\frac{a\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) = \phi\left(-\frac{b\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) \implies a = \pm b, $$ since the normal pdf is symmetric about the $y$-axis and strictly decreasing in $|x|$.

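We can eliminate $a = -b$, since the confidence interval $[\overline{X} - a, \overline{X} + b]$ would then have zero length and could not have coverage $1 - \alpha$. Thus $a = b$, which gives you the symmetry you want. Substituting $a = b$ into the constraint and using $\Phi(-x) = 1 - \Phi(x)$ gives $$2\,\Phi\left(\frac{a\sqrt{n}}{\sqrt{\theta(1-\theta)}}\right) - 1 = 1 - \alpha \implies a = b = z_{1 - \frac{\alpha}{2}}\sqrt{\frac{\theta(1-\theta)}{n}}.$$ Combined with what you have above, this says that $z_{\alpha_2} = z_{\frac{\alpha}{2}} = -z_{1 - \frac{\alpha}{2}}$ and $z_{\alpha_1} = z_{1 - \frac{\alpha}{2}}$, as desired.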
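The conclusion $a = b$ can also be checked numerically. The following is a rough sketch rather than a proof: it solves the coverage constraint for $b$ given $a$, scans a grid of $a$ values, and compares the minimizer of $a + b$ with the closed form $z_{1-\alpha/2}\sqrt{\theta(1-\theta)/n}$. The values of `n`, `theta`, `alpha`, the grid step, and the helper name `b_for` are assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution
n, theta, alpha = 100, 0.3, 0.05                  # illustrative values
scale = sqrt(n) / sqrt(theta * (1 - theta))

def b_for(a):
    """Solve Phi(a*scale) - Phi(-b*scale) = 1 - alpha for b, given a."""
    lower_tail = Z.cdf(a * scale) - (1 - alpha)   # must lie in (0, 1)
    return -Z.inv_cdf(lower_tail) / scale

# For a at or below this value, no b satisfies the constraint.
a_min = Z.inv_cdf(1 - alpha) / scale

# Scan a grid of feasible a values and keep the one with the shortest interval.
grid = [a_min + k * 1e-4 for k in range(1, 1000)]
best_a = min(grid, key=lambda a: a + b_for(a))
best_b = b_for(best_a)

closed_form = Z.inv_cdf(1 - alpha / 2) / scale    # z_{1-alpha/2} * sqrt(theta(1-theta)/n)

print(f"grid minimizer: a={best_a:.4f}  b={best_b:.4f}")
print(f"closed form:    a=b={closed_form:.4f}")
```

Up to the grid resolution, the shortest valid interval has $a \approx b$, and both agree with the closed-form half-width.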