Is this a general form for the confidence interval of a uniform distribution?


Let $X_1, \ldots, X_n$ be a random sample with $X_i \sim \mathcal{U}[0, \theta]$ (where $\mathcal{U}$ denotes the uniform distribution). Let $Y = \max(X_1, \ldots, X_n)$, the MLE of $\theta$. It can be shown that $U = \frac{Y}{\theta}$ has density

\begin{align*} f_U(u) = \begin{cases} n u^{n-1} & 0 \leq u \leq 1 \\ 0 & \text{otherwise} \end{cases} \end{align*}

Evidently, $P(A \leq \frac{Y}{\theta} \leq B)$ is given by

\begin{align*} F_U(B) - F_U(A) \end{align*}

Then, if $\mathcal{P}(A, B) := P(A \leq \frac{Y}{\theta} \leq B)$, we have

\begin{align*} \mathcal{P}(A, B) &= n \left[\int_0^{B} u^{n-1} ~ du - \int_0^{A} u^{n-1} ~ du \right] \\ &= n \left[ \frac{B^n}{n} - \frac{A^n}{n} \right] \\ &= B^n - A^n \end{align*}

From this it follows that $\mathcal{P}(\alpha^{\frac{1}{n}}, 1) = 1 - \alpha$. I want to use this to form confidence intervals for $\theta$, but I am unsure whether my procedure is correct. Simple manipulations show that

\begin{align*} P(\alpha^{\frac{1}{n}} \leq \frac{Y}{\theta} \leq 1) = P( Y \leq \theta \leq \frac{Y}{\alpha^{\frac{1}{n}}}) = 1 - \alpha \end{align*}

Thus, the expression seems to produce the CI $[Y, Y \alpha^{-\frac{1}{n}}]$ with confidence $1-\alpha$. For example, for $\alpha = 1/2 $, we have that $\theta$ will belong to $[Y, \frac{Y}{2^{\frac{1}{n}}}]$. Since $2^{\frac{1}{n}} \to 1$ as $n \to \infty$, for sufficiently large $n$ this means $\theta \in [Y, Y + \epsilon]$ for a very small $\epsilon$, with probability $1 - \alpha = 50\%$.

Is this correct? If not, how can one use the derivations presented to form confidence intervals for the real value of $\theta$?


There are 2 best solutions below

1
On BEST ANSWER

Your derivation seems correct, except that you wrote $2^{\frac1n}$ instead of $\left(\frac12\right)^{\frac1n}$ in the denominator, so the upper endpoint should be $Y / \left(\frac12\right)^{\frac1n} = 2^{\frac1n}\, Y$. The conclusion is correct, though: the more data you have, the closer the maximum is likely to be to $\theta$, and thus the narrower the interval you need for a given confidence level.
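A quick simulation can sanity-check both claims: that $[Y, Y\alpha^{-\frac1n}]$ covers $\theta$ about $1-\alpha$ of the time, and that it narrows as $n$ grows. This is only a minimal sketch; the function name `coverage_and_width` and the values `theta=3.0`, `trials`, and `seed` are arbitrary assumptions made for illustration, not part of the question.

```python
import random

def coverage_and_width(n, alpha, theta=3.0, trials=4000, seed=0):
    """Estimate the coverage and mean width of [Y, Y * alpha**(-1/n)].

    theta, trials and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    scale = alpha ** (-1.0 / n)  # upper endpoint is Y * scale
    hits = 0
    total_width = 0.0
    for _ in range(trials):
        # Y is the maximum of n draws from U[0, theta]
        y = max(rng.uniform(0.0, theta) for _ in range(n))
        upper = y * scale
        hits += (y <= theta <= upper)
        total_width += upper - y
    return hits / trials, total_width / trials

for n in (5, 20, 100):
    cov, width = coverage_and_width(n, alpha=0.05)
    print(f"n={n:3d}  coverage={cov:.3f}  mean width={width:.4f}")
```

The estimated coverage should stay close to $0.95$ for every $n$, while the mean width should shrink as $n$ increases.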

5
On

Actually, to reach the conclusion that a confidence interval involving the maximum order statistic $Y_n$ (the ML estimator of $\theta$) of the form $\color{blue}{\left (Y_n, Y_n+\epsilon \right)}$ is the best one, we need more work because one can construct different types of confidence intervals based on $Y_n$.

As $$\frac{Y_n}{\theta} \sim Beta(n,1),$$

we have for any $\beta \in [0,1]$

$$\mathbb P \left ( b_{1-\alpha (1-\beta)} \le \frac{Y_n}{\theta} \le b_{\alpha \beta } \right)=1-\alpha$$

where, for any $w \in [0,1]$, $b_w$ denotes the upper $w$-quantile of $Beta(n,1)$, defined by

$$F_{Beta(n,1)}(b_w)=1-w.$$

Considering $F_{Beta(n,1)}(x)=x^n$ for $0 \le x \le 1$,

$$b_{\alpha \beta}=\left(1-\alpha \beta \right)^{\frac{1}{n}}$$

$$b_{1-\alpha (1-\beta)}=\left(\alpha (1-\beta) \right)^{\frac{1}{n}}.$$

Hence, we obtain the following confidence interval:

$$\left ( \frac{Y_n}{\left(1-\alpha \beta \right)^{\frac{1}{n}} }, \frac{Y_n}{\left( \alpha (1-\beta) \right)^{\frac{1}{n}}} \right).$$

For $\beta=0, \frac{1}{2}, 1,$ we get the following confidence intervals:

$$\left ( Y_n , \frac{Y_n}{\alpha ^{\frac{1}{n}}} \right),$$

$$\left ( \frac{Y_n}{\left(1-\frac{\alpha}{2} \right)^{\frac{1}{n}}}, \frac{Y_n}{\left(\frac{\alpha}{2} \right)^{\frac{1}{n}}} \right),$$

$$\left ( \frac{Y_n}{\left(1-\alpha \right)^{\frac{1}{n}}}, \infty \right).$$

These are of the forms $\color{blue}{\left (Y_n, Y_n+\epsilon \right)}$, $\color{blue}{\left (Y_n+\epsilon_1, Y_n+\epsilon_2 \right)}$, and $\color{blue}{\left (Y_n+\epsilon, \infty \right)}$, respectively.
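The general interval and its three special cases can be sketched in code as follows. This is a minimal illustration rather than a reference implementation; the names `beta_family_interval` and `estimate_coverage`, and the values `theta=2.0`, `trials`, and `seed`, are assumptions introduced here.

```python
import random

def beta_family_interval(y_max, n, alpha, beta):
    """Return (Y/(1 - alpha*beta)^(1/n), Y/(alpha*(1 - beta))^(1/n))."""
    if not (0.0 < alpha < 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("need 0 < alpha < 1 and 0 <= beta <= 1")
    lower = y_max / (1.0 - alpha * beta) ** (1.0 / n)
    tail = alpha * (1.0 - beta)
    # beta = 1 puts all of alpha in the lower tail, so the upper end is infinite
    upper = float("inf") if tail == 0.0 else y_max / tail ** (1.0 / n)
    return lower, upper

def estimate_coverage(n, alpha, beta, theta=2.0, trials=4000, seed=1):
    """Monte Carlo estimate of how often the interval contains theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        y_max = max(rng.uniform(0.0, theta) for _ in range(n))
        lower, upper = beta_family_interval(y_max, n, alpha, beta)
        hits += (lower <= theta <= upper)
    return hits / trials

for beta in (0.0, 0.5, 1.0):
    interval = beta_family_interval(1.0, 10, 0.05, beta)
    print(beta, interval, estimate_coverage(10, 0.05, beta))
```

All three choices of $\beta$ should give an estimated coverage near $1-\alpha$; they differ only in where the interval sits relative to $Y_n$.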

The parameter $\beta$ is often chosen so that the expected length of the confidence interval,

$$\mathbb E(Y_n) \left [ \left( \alpha (1-\beta) \right)^{-\frac{1}{n}} - \left(1-\alpha \beta \right)^{-\frac{1}{n}} \right ],$$

is minimized. Following this approach, and noting that the expression above is increasing in $\beta$ (its derivative with respect to $\beta$ is positive for $0 < \alpha < 1$), $\beta=0$ is the global minimizer, and thus $\color{blue}{\left (Y_n, Y_n+\epsilon \right)}$ is the preferred interval.
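The monotonicity in $\beta$ can also be checked numerically. The sketch below evaluates only the bracketed factor of the expected length (the factor $\mathbb E(Y_n)$ does not depend on $\beta$); the helper name `length_factor` and the values $\alpha = 0.05$, $n = 10$ are assumptions chosen for illustration.

```python
def length_factor(alpha, beta, n):
    """(alpha*(1 - beta))^(-1/n) - (1 - alpha*beta)^(-1/n)."""
    return (alpha * (1.0 - beta)) ** (-1.0 / n) - (1.0 - alpha * beta) ** (-1.0 / n)

alpha, n = 0.05, 10
# beta = 1 is excluded because the interval length is infinite there
betas = [i / 10 for i in range(10)]
factors = [length_factor(alpha, b, n) for b in betas]

for b, f in zip(betas, factors):
    print(f"beta={b:.1f}  length factor={f:.4f}")
```

On this grid the factor grows with $\beta$, which is consistent with $\beta = 0$ giving the shortest expected length.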