Rayleigh Distribution: MLE biased?


This is most of an exam question I am doing for revision; some parts I have completed, others I am not sure about.

We have $H$, the maximum height (depth?) of a river each year, modelled as a Rayleigh distribution with $x>0$ and the parameter $a$ unknown: $$ f_a(x) = \frac{x}{a}\exp\left(-\frac{x^2}{2a}\right)$$ We have 8 years of data: $$ 2.5\quad1.8\quad2.9\quad0.9\quad2.1\quad1.7\quad2.2\quad2.8$$ We let these be observations $X_i$ and also know $\sum X_i=16.9$, $\sum X_i^2 = 38.69$ and $\sum \sqrt{X_i}=11.48$.

Part 1 Calculate the CDF; this is $F_a(t) = 1-\exp\left(-\frac{t^2}{2a}\right)$ for $t>0$.

Part 2 What does a hypothesis that a 6 m flood only happens once every thousand years mean for $a$? I am very unsure about this. I think this means the probability of $H$ being $6$ is $1/1000$. So we need to integrate $f_a(x)$ from 0 to 6, and set that to $0.999$? i.e. $F_a(6)<0.001$, so $$1-\exp\left(-\frac{36}{2a}\right)<0.001 \text{ and so } \exp\left(-\frac{18}{a}\right)<0.999$$ So $a>\frac{-18}{\ln0.999}=17\,991$ approx. Is this how to do this part?

Part 3 Find the maximum likelihood estimator for $a$. I got $\hat{a}=(2n)^{-1}\sum x_i^2$, which I think is correct per Wikipedia.

Is this biased? I think it isn't, per Wikipedia, but I don't really know what that means or how to demonstrate it.

Part 4 Assuming $H^2/a\sim\chi^2_2$, let $a_0>a_1$ be two positive reals, let $\alpha\in]0,1[$. Now construct a Neyman-Pearson test for hypotheses $H_0:a=a_0$ and $H_1: a=a_1$. From this deduce an optimal test of level $\alpha$ for hypotheses $H_0:a>a_0$ versus $H_1:a<a_1$.

What I have so far is that I need to compare likelihood functions like so: \begin{align} L(a_1)>KL(a_0)&\Leftrightarrow \frac{x}{a_1}\exp\left(-\frac{x^2}{2a_1}\right) > K\frac{x}{a_0}\exp\left(-\frac{x^2}{2a_0}\right)\\ &\Leftrightarrow K < \frac{a_0}{a_1}\exp\left(x^2\left( \frac{1}{2a_0}-\frac{1}{2a_1} \right)\right) \end{align}

I don't know how to continue and construct the test.

Part 5 Carry out a test with $\alpha=10\%$ for the hypothesis in Part 2. We are given the R output for qchisq with the first parameter in $\{0.05,0.1,0.9,0.95\}$ and the second parameter in $\{8,16\}$, so 8 values in total. Comment if you want me to add them all in.

Thanks for any help/explanation


There are 2 best solutions below


Part 2: I think the inequality is slightly incorrect, assuming "6m flood only happens once every thousand years" means "6m or greater floods happen no more frequently than once in 1000 years": instead of $F(6) < 0.001$ it should be $1 - F(6) \leq 0.001$, i.e. $F(6) \geq 0.999$, so $\exp(-18/a) \leq 0.001$ and thus $a \leq 18/\ln 1000 = 6/(\ln 10) \approx 2.6$.
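As a quick numerical check of that bound, here is a minimal Python sketch. It assumes the reading $\Pr(H \geq 6) \leq 1/1000$; the names `rayleigh_cdf`, `a_max` and `exceed_prob` are made up for illustration.

```python
import math

def rayleigh_cdf(t, a):
    """CDF of the density f_a(x) = (x/a) exp(-x^2 / (2a)), for t >= 0."""
    return 1.0 - math.exp(-t * t / (2.0 * a))

# Largest a with P(H >= 6) <= 1/1000, i.e. exp(-18/a) <= 0.001.
a_max = 18.0 / math.log(1000.0)  # same value as 6 / ln(10)

# Exceedance probability of a 6 m flood at that boundary value of a.
exceed_prob = 1.0 - rayleigh_cdf(6.0, a_max)

print(a_max, exceed_prob)
```

Any $a$ below `a_max` gives a smaller exceedance probability, since $\exp(-18/a)$ is increasing in $a$.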

Part 3: An estimator is unbiased if its expected value equals the value of the parameter being estimated (https://en.wikipedia.org/wiki/Estimator#Bias). In this case, $\mathsf{E}\left[ (2 n)^{-1} \sum_{i=1}^n H_i^2 \right] = (2 n)^{-1} n\, \mathsf{E} H^2 = 2^{-1} \cdot 2 a = a$ (where the $H_i$ are $n$ i.i.d. random variables with the specified distribution), so it is, indeed, unbiased.
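The unbiasedness claim can also be checked empirically. The sketch below computes $\hat a$ for the eight observations in the question and then averages $\hat a$ over many simulated samples. The values `true_a`, `reps` and the seed are arbitrary choices for illustration, not part of the question, and the sampler inverts the CDF $1-\exp(-t^2/(2a))$.

```python
import math
import random

def mle_a(xs):
    """MLE of a for the density (x/a) exp(-x^2 / (2a)): sum(x^2) / (2n)."""
    return sum(x * x for x in xs) / (2.0 * len(xs))

def sample_rayleigh(a, n, rng):
    """Draw n values by inverting the CDF F(t) = 1 - exp(-t^2 / (2a))."""
    return [math.sqrt(-2.0 * a * math.log(1.0 - rng.random())) for _ in range(n)]

# Observations from the question; the estimate is 38.69 / 16.
data = [2.5, 1.8, 2.9, 0.9, 2.1, 1.7, 2.2, 2.8]
a_hat = mle_a(data)

# Hypothetical simulation settings (assumptions, not from the question).
true_a, n, reps = 2.0, 8, 20000
rng = random.Random(0)
mean_estimate = sum(mle_a(sample_rayleigh(true_a, n, rng)) for _ in range(reps)) / reps

print(a_hat, mean_estimate)
```

If the estimator is unbiased, `mean_estimate` should land close to `true_a`, up to Monte Carlo noise.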

Part 4: The next step is to calculate $K$ as a function of $\alpha$, so that the probability of the observations falling in the rejection region when the null hypothesis is true satisfies $\Pr(\{x: L(x, a_1) > K L(x, a_0)\}; a_0) = \alpha$. Note that the likelihood depends on the whole array of observations, not just a single one.


$$ \operatorname{E}\left( \frac 1 {2n} \sum_{i=1}^n X_i^2 \right) = \frac 1 {2n} \sum_{i=1}^n \operatorname{E}(X_i^2) = \frac 1 {2n} n\operatorname{E}(X_1^2), \tag 1 $$ so showing unbiasedness just requires finding that last expected value. \begin{align} \operatorname{E}(X_1^2) & = \int_0^\infty x^2 f(x)\,dx = \int_0^\infty x^2 \frac{x}{a}\exp\left(-\frac{x^2}{2a}\right) \, dx \\[10pt] & = \int_0^\infty x^2 \exp\left( -\frac{x^2}{2a} \right) \left( \frac x a \, dx \right) \\[10pt] & = \int_0^\infty 2au \exp(-u)\,du \quad\text{where }u = \frac{x^2}{2a} \\[10pt] & = 2a\int_0^\infty u e^{-u}\,du = 2a. \end{align} Plug that into $(1)$, getting $\dfrac 1 {2n} \cdot n\cdot 2a = a$, so this is indeed unbiased.
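For anyone who wants to double-check the integral numerically, here is a small sketch using a plain midpoint rule. The function name `second_moment`, the truncation point `upper` and the test value of `a` are illustrative assumptions.

```python
import math

def second_moment(a, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral of x^2 * f_a(x) over [0, upper].

    f_a(x) = (x/a) exp(-x^2 / (2a)); 'upper' must be large relative to sqrt(a)
    so that the truncated tail is negligible.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x * x * (x / a) * math.exp(-x * x / (2.0 * a))
    return total * h

a = 2.0  # hypothetical parameter value
print(second_moment(a), 2.0 * a)
```

The two printed numbers should agree closely, matching $\operatorname{E}(X_1^2) = 2a$.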

In part 2, you have $$ \exp\left(-\frac{18}{a}\right)<0.999 $$ $$ \frac{-18} a < \log 0.999 $$ $$ \frac{18} a > - \log 0.999 $$ $$ \frac a {18} < \frac{-1}{\log0.999} $$ $$ a < \frac{-18}{\log 0.999}. $$

You seem to have $a_0>a_1$. You seem to be using likelihood functions based only on a single observation. You need $$ L(a) = \frac{\prod_{i=1}^n x_i}{a^n} \exp\left( \frac{-\sum_{i=1}^n x_i^2}{2a} \right) $$ Thus your inequality $L(a_1)>KL(a_0)$ becomes

$$ K < \frac{a_0^n}{a_1^n}\exp\left( \left(\frac{1}{2a_0}-\frac{1}{2a_1}\right) \sum_{i=1}^n x_i^2 \right) $$ Since $\dfrac 1 {2a_0} - \dfrac 1 {2a_1}<0$ and $\exp$ is an increasing function, the inequality holds exactly when $\sum_{i=1}^n x_i^2 < c$, for some $c$. Find the value of $c$ for which, under $H_0$, the sum is less than $c$ with probability equal to the level of your test.
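Here is a rough sketch of how $c$ could be computed for the data in the question. It relies on the assumption from Part 4 that $X_i^2/a \sim \chi^2_2$, so that $\sum X_i^2 / a_0 \sim \chi^2_{2n}$ under $a = a_0$ and $c = a_0 \cdot q_\alpha$ with $q_\alpha$ the $\alpha$-quantile of $\chi^2_{2n}$ (what R's `qchisq(0.1, 16)` returns for $n=8$). The value $a_0 = 6/\ln 10$ is an assumption borrowed from one reading of Part 2, and the helper names are hypothetical. The quantile is found by bisection on the closed-form CDF that exists for even degrees of freedom, so only the standard library is needed.

```python
import math

def chi2_cdf_even(x, df):
    """Chi-square CDF for even df = 2k: 1 - exp(-x/2) * sum_{j<k} (x/2)^j / j!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(p, df, tol=1e-10):
    """Quantile by bisection on the CDF above (0 < p < 1, even df)."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, df) < p:
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

data = [2.5, 1.8, 2.9, 0.9, 2.1, 1.7, 2.2, 2.8]
n = len(data)
alpha = 0.10
a0 = 6.0 / math.log(10.0)  # assumed boundary value, taken from Part 2

stat = sum(x * x for x in data)              # sum of squares, 38.69
c = a0 * chi2_quantile_even(alpha, 2 * n)    # threshold for the rule "reject if stat < c"
reject = stat < c

print(stat, c, reject)
```

With these inputs the statistic sits well above the threshold (roughly 24.3), so under these assumptions the test does not reject $a = a_0$ in favour of a smaller $a$.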