Find a $95\%$ confidence interval for $θ$ by inverting the test statistic $\hat{θ}$


I want to define a 95% confidence interval for the unknown parameter $θ$ by inverting a test based on its MLE.

For our data we have

$$Y_i \sim N(θx_i,1) \quad \text{for} \quad i=1,\ldots,n.$$

It can be shown that the MLE for $θ$ is given by

$$\hat{θ} = \frac{\sum x_i Y_i}{\sum x_i^2}$$
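For completeness, here is a sketch of where this MLE comes from, using only the model above: maximize the log-likelihood of the $Y_i$ and set the derivative to zero.

```latex
\ell(\theta) = -\frac{n}{2}\log(2\pi) - \frac{1}{2}\sum_{i=1}^n (Y_i - \theta x_i)^2,
\qquad
\ell'(\theta) = \sum_{i=1}^n x_i (Y_i - \theta x_i) = 0
\;\Longrightarrow\;
\hat{\theta} = \frac{\sum_i x_i Y_i}{\sum_i x_i^2}.
```

Since $\ell''(\theta) = -\sum_i x_i^2 < 0$, this stationary point is indeed the maximum.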

To find the confidence interval I should invert the test statistic $\hat{θ}$.

The most powerful unbiased size $α = 0.05$ test for testing

$$H_0: μ = μ_0 \quad \text{vs.} \quad H_1: μ ≠ μ_0$$

where $X_1,\ldots,X_n \sim \text{iid } N(μ,σ^2)$ has acceptance region

$$A(μ_0) = \left\{\mathbf{x}: |\bar{x} - μ_0| ≤ 1.96\,σ/\sqrt{n}\right\}.$$

Substituting into my problem (I think), we get that the most powerful unbiased size $α = 0.05$ test for testing

$$H_0: θ = \hat{θ} \quad \text{vs.} \quad H_1: θ ≠ \hat{θ}$$

has acceptance region

$$A(\hat{θ}) = \{\mathbf{y}: |\bar{y} - \hat{θ}| ≤ 1.96/\sqrt{n}\}$$

or equivalently,

$$A(\hat{θ}) = \left\{\mathbf{y}: \frac{\sqrt{n}\bar{y} - 1.96}{\sqrt{n}x_i} ≤ \hat{θ} ≤ \frac{\sqrt{n}\bar{y} + 1.96}{\sqrt{n}x_i} \right\}$$

Substituting $\hat{θ} = \sum x_i Y_i/\sum x_i^2$ we obtain

$$A(\hat{θ}) = \left\{\mathbf{y}: \frac{\sqrt{n}\bar{y} - 1.96}{\sqrt{n}x_i} ≤ \frac{\sum x_i Y_i}{\sum x_i^2} ≤ \frac{\sqrt{n}\bar{y} + 1.96}{\sqrt{n}x_i}\right\}$$

This means that my $1 - 0.05 = 0.95$ (95%) confidence interval is defined to be

$$ C(y) = \{\hat{θ}: y ∈ A(\hat{θ})\}$$

But I can't arrive at anything concrete, and I feel that I've made a mistake somewhere. What should I do?

Reference for the question: http://people.unica.it/musio/files/2008/10/Casella-Berger.pdf (Section 9.2.1)


Best answer:

You have $$ \widehat \theta = \frac{\sum_i x_i Y_i}{\sum_i x_i^2}. $$

Although non-linear in $(x_1,\ldots,x_n)$, this is linear in $(Y_1,\ldots,Y_n)$, and that's why it's a "linear" model (it's not because one is fitting a straight line; if that were true then least-squares fitting of polynomials would be considered non-linear regression, but it's linear regression).

So you have a linear combination of independent normally distributed random variables, where the coefficients are constant (i.e. not random). That makes its distribution easy to find: \begin{align} \widehat\theta \sim N\left( \frac{\sum_i (x_i \operatorname{E}(Y_i))}{\sum_i x_i^2} , \frac{\sum_i \big(x_i^2 \operatorname{var}(Y_i)\big)}{\left( \sum_i x_i^2 \right)^2} \right) & = N \left( \theta, \frac 1 {\sum_i x_i^2} \right) \\ & \text{since } \operatorname{E}(Y_i) = x_i\theta \text{ and } \operatorname{var}(Y_i) = 1. \end{align}
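This distributional claim is easy to check numerically. A minimal Monte Carlo sketch in NumPy, with arbitrary illustrative values for $θ$ and the $x_i$ (these are my choices, not from the problem):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 2.0                               # illustrative true parameter
x = np.array([0.5, 1.0, 1.5, 2.0, 2.5])  # illustrative design points
reps = 200_000

# Simulate Y_i ~ N(theta * x_i, 1) independently, `reps` times.
Y = theta * x + rng.standard_normal((reps, x.size))

# MLE for each replication: sum_i x_i Y_i / sum_i x_i^2
theta_hat = (Y @ x) / np.sum(x ** 2)

print(theta_hat.mean())  # close to theta
print(theta_hat.var())   # close to 1 / sum(x_i^2) = 1 / 13.75
```

The empirical mean and variance of `theta_hat` match $θ$ and $1/\sum_i x_i^2$ to within simulation noise.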

So $$ \Pr \left( \theta - \frac{A}{\sqrt{\sum_i x_i^2}} < \widehat\theta < \theta + \frac{A}{\sqrt{\sum_i x_i^2}} \right) = 0.95 $$ where $A$ is such that $\Pr( -A<Z<A ) = 0.95$ for standard normal $Z$, i.e. $A \approx 1.96$. Rearranging the inequalities, $$ \Pr \left( \widehat \theta - \frac{A}{\sqrt{\sum_i x_i^2}} < \theta < \widehat \theta + \frac{A}{\sqrt{\sum_i x_i^2}} \right) = 0.95, $$ and that is your confidence interval. Note that the half-width involves $\sqrt{\sum_i x_i^2}$, not $\sqrt{n}$.
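Putting this interval into code (a sketch; the function name and the toy data are mine, not from the source):

```python
import numpy as np

Z_975 = 1.959963984540054  # 97.5% standard-normal quantile, the A above

def ci_theta(x, y, z=Z_975):
    """95% CI for theta in the model Y_i ~ N(theta * x_i, 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    theta_hat = x @ y / np.sum(x ** 2)  # the MLE
    half = z / np.sqrt(np.sum(x ** 2))  # A / sqrt(sum_i x_i^2)
    return theta_hat - half, theta_hat + half

lo, hi = ci_theta([1.0, 2.0, 3.0], [1.1, 2.1, 2.9])  # toy data
print(lo, hi)
```

The interval is centered at $\hat{θ}$ with half-width $A/\sqrt{\sum_i x_i^2}$; for the toy data above, $\hat{θ} = 14/14 = 1$.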