Statistics, least squares method

I am having problems with an exercise. I have some observations of the random variable $Y$: $0.17, 0.06, 1.76, 3.41, 11.68, 1.86, 1.27, 0.00, 0.04,$ and $2.10$.

I know that $Y = X^2$ and that $X \sim \mathrm{N}(\mu, 1)$. Now I am supposed to estimate $\mu$ using the least squares method.

I use the formula:

$$Q(\mu) = \sum_{i=1}^{10} (x_i - \mu)^2.$$

My solution is that since $Y = X^2 \Leftrightarrow X = \sqrt{Y}$, then using the formula I have

$$Q(\mu) = \sum_{i=1}^{10} (\sqrt{y_i} - \mu)^2 = \sum_{i=1}^{10} (\sqrt{y_i})^2 - 2 \mu \sum_{i=1}^{10} \sqrt{y_i} + \sum_{i=1}^{10} \mu^2$$

which gives me

$$(0.17 + 0.06 +\cdots +2.10)-2\mu (\sqrt{0.17}+\sqrt{0.06}+\cdots+\sqrt{2.10}) + 10\mu^2$$ and

$$Q'(\mu) = 20\mu - 2(\sqrt{0.17}+\sqrt{0.06}+\cdots+\sqrt{2.10}).$$

Setting this to zero (to minimize) gives me

$$\mu = 2(\sqrt{0.17}+\sqrt{0.06}+\cdots+\sqrt{2.10})/20 = 1.13880\ldots$$

However, the answer should be 1.111. Can you spot any obvious mistakes? I have double-checked this many times and it still looks right to me, so the only conclusion I can draw is that I have missed something important about how to use this method. Any help is appreciated.

There are 1 best solutions below

On BEST ANSWER

Let $\bar x =(x_1+\cdots+x_n)/n$ be the sample mean. Then
\begin{align}
\sum_{i=1}^n (x_i - \mu)^2 & = \sum_{i=1}^n ((x_i-\bar x) + (\bar x - \mu))^2 \\[10pt]
& = \sum_{i=1}^n \Big( (x_i-\bar x)^2 + 2(x_i-\bar x)(\bar x-\mu) + (\bar x - \mu)^2 \Big) \\[10pt]
& = \left( \sum_{i=1}^n (x_i-\bar x)^2 \right) + \left( 2 (\bar x - \mu) \underbrace{\sum_{i=1}^n (x_i-\bar x)}_{\text{This sum is 0.}} \right) + n(\bar x - \mu)^2 \\[10pt]
& = n(\bar x - \mu)^2 + (\text{something not depending on }\mu).
\end{align}
The first expression in the display above (the sum of squares of deviations from $\mu$) is thus shown to be equal to the last expression, an increasing function of the square of the absolute difference between the population mean $\mu$ and the sample mean $\bar x$. The last expression is clearly minimized by $\mu=\bar x$; therefore the sample mean $\bar x$ is the least squares estimator of the population mean $\mu$. And you appear to have computed it correctly.
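As a quick numerical check of the decomposition, here is a minimal Python sketch. It takes $x_i = \sqrt{y_i}$, as the question does (an assumption carried over from the question, not something the data force), and the names `Q`, `x_bar`, and `spread` are only illustrative.

```python
import math

# Observations of Y from the question.
y = [0.17, 0.06, 1.76, 3.41, 11.68, 1.86, 1.27, 0.00, 0.04, 2.10]

# Assumption from the question: x_i = sqrt(y_i).
x = [math.sqrt(v) for v in y]
n = len(x)
x_bar = sum(x) / n

def Q(mu):
    """Sum of squared deviations of the x_i from mu."""
    return sum((xi - mu) ** 2 for xi in x)

# The part of the decomposition that does not depend on mu.
spread = sum((xi - x_bar) ** 2 for xi in x)

# Compare both sides of the identity at a few trial values of mu.
for mu in (0.0, 1.0, x_bar, 2.5):
    lhs = Q(mu)
    rhs = n * (x_bar - mu) ** 2 + spread
    print(f"mu={mu:.4f}  Q(mu)={lhs:.6f}  n*(x_bar-mu)^2+const={rhs:.6f}")

print(f"sample mean of sqrt(y): {x_bar:.5f}")
```

The two columns should agree up to rounding, and the smallest value of `Q` among the trial points occurs at `mu = x_bar`, which is about 1.1388.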

Let us note, however, that $$ \operatorname{E} Y = \operatorname{E}(X^2) = \Big(\operatorname{E} X\Big)^2 + \operatorname{var}(X) = \mu^2 + 1. $$ The sample mean of the $Y$ values, which is the least-squares estimator of the population mean of $Y$, is $2.235$. So if we then solve $\mu^2+1=2.235$ for $\mu$, we get about $1.111306$, apparently coinciding with what you say is the correct value.
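The arithmetic in this second approach can be reproduced with a short Python sketch. The variable names `y_bar` and `mu_hat` are illustrative, and the code simply solves $\mu^2 + 1 = \bar y$ for the nonnegative root, which only makes sense when $\bar y \ge 1$.

```python
import math

# Observations of Y from the question.
y = [0.17, 0.06, 1.76, 3.41, 11.68, 1.86, 1.27, 0.00, 0.04, 2.10]

# Sample mean of Y, the least-squares estimate of E(Y).
y_bar = sum(y) / len(y)

# Solve mu^2 + 1 = y_bar for mu >= 0 (requires y_bar >= 1).
if y_bar < 1.0:
    raise ValueError("sample mean of Y is below 1; no real solution for mu")
mu_hat = math.sqrt(y_bar - 1.0)

print(f"mean of y: {y_bar:.3f}")
print(f"estimate of mu: {mu_hat:.4f}")
```

With these data the sample mean of $Y$ is about 2.235 and the resulting estimate is about 1.1113.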

However, either way, the maximum-likelihood estimate of $\mu$ is just what you got, the mean of the $X$ values.