Finding the MLE of the expected value of two sequences of normals.


I have $X_1, \ldots, X_{n_1} \sim$ Normal$(\mu, \sigma^2_1)$ and $Y_1, \ldots, Y_{n_2} \sim$ Normal$(\mu, \sigma^2_2)$, with $\sigma^2_i$ known. All random variables are independent. I want to find the maximum likelihood estimate of $\mu$.

Now, if I had just the $X_i$, I would have said that $\bar{\mu} = \frac{\sum X_i}{n_1}$. However, here I also have that $\bar{\mu} = \frac{\sum Y_i}{n_2}$. How can I combine these two results to get a better estimate of $\mu$?

I tried just writing $\bar{\mu} = \frac{1}{2} \big( \frac{\sum X_i}{n_1} + \frac{\sum Y_i}{n_2}\big)$, but I don't think this leads to something correct.

Do you have any hints?

On BEST ANSWER

Well, intuition will get you closer. If $n_1$ is much smaller than $n_2$, and $\sigma_1^2$ is greater than $\sigma_2^2$, then the variation arising from the sample coming from $X$ is going to be substantially greater than the variation in the sample from $Y$, and your MLE would need to reflect this. Presently, your choice does not, as it gives equal weight to the sample means from each distribution regardless of how many observations are in each group.
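To make this intuition concrete, here is a small simulation with made-up parameter values (a few noisy $X$ observations versus many precise $Y$ observations), comparing the spread of the equal-weight average against a precision-weighted average. The specific numbers below are purely illustrative.

```python
import random
import statistics

# Hypothetical setup: a small, noisy X-sample and a large, precise Y-sample.
mu = 5.0
n1, sigma1 = 4, 3.0    # few observations, large variance
n2, sigma2 = 40, 1.0   # many observations, small variance

# Precision weights: proportional to n_i / sigma_i^2.
w1 = n1 / sigma1**2
w2 = n2 / sigma2**2

random.seed(0)
equal, weighted = [], []
for _ in range(5000):
    xbar = statistics.fmean(random.gauss(mu, sigma1) for _ in range(n1))
    ybar = statistics.fmean(random.gauss(mu, sigma2) for _ in range(n2))
    equal.append(0.5 * (xbar + ybar))                     # equal weights
    weighted.append((w1 * xbar + w2 * ybar) / (w1 + w2))  # precision weights

# The equal-weight estimator inherits the full noise of the small X-sample;
# the precision-weighted one has noticeably smaller sampling variance.
print(statistics.pvariance(equal), statistics.pvariance(weighted))
```

Here the equal-weight estimator's variance is dominated by the $\sigma_1^2/n_1$ term, while the precision-weighted combination down-weights the noisy sample.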

So, let's reason more formally by actually constructing the likelihood function and maximizing it. Note that the joint likelihood of the combined sample $\boldsymbol x = (x_1, \ldots, x_{n_1})$ and $\boldsymbol y = (y_1, \ldots, y_{n_2})$ is simply $$\mathcal L(\mu \mid \boldsymbol x, \boldsymbol y, \sigma_1, \sigma_2) \propto f_{\boldsymbol X, \boldsymbol Y}(\boldsymbol x, \boldsymbol y \mid \mu, \sigma_1, \sigma_2) = \prod_{i=1}^{n_1} \frac{e^{-(x_i - \mu)^2/(2\sigma_1^2)}}{\sqrt{2\pi}\sigma_1} \prod_{j=1}^{n_2} \frac{e^{-(y_j - \mu)^2/(2\sigma_2^2)}}{\sqrt{2\pi}\sigma_2}.$$ That is to say, the likelihood is proportional to the joint density of the sample.

We can ignore any factors in $\mathcal L$ not dependent on $\mu$, as these are fixed with respect to $\mu$: $$\mathcal L(\mu \mid \boldsymbol x, \boldsymbol y, \sigma_1, \sigma_2) \propto \exp\left(-\frac{1}{2\sigma_1^2} \sum_{i=1}^{n_1} (x_i - \mu)^2 \right) \exp\left(-\frac{1}{2\sigma_2^2} \sum_{j=1}^{n_2} (y_j - \mu)^2 \right).$$

But we can write this likelihood in terms of $\bar x$ and $\bar y$ rather than the sample itself, if we partition the sum of squares like so: $$\begin{align*} \sum_{i=1}^{n_1} (x_i - \mu)^2 &= \sum_{i=1}^{n_1} (x_i - \bar x + \bar x - \mu)^2 \\ &= \sum_{i=1}^{n_1} \left((x_i - \bar x)^2 + 2(x_i - \bar x)(\bar x - \mu) + (\bar x - \mu)^2 \right) \\ &= n_1 (\bar x - \mu)^2 + \sum_{i=1}^{n_1} (x_i - \bar x)^2 + 2(\bar x - \mu)\sum_{i=1}^{n_1} (x_i - \bar x) \\ &= n_1 (\bar x - \mu)^2 + \sum_{i=1}^{n_1} (x_i - \bar x)^2. \end{align*}$$ The last equality is true because the last sum in the previous expression is zero (why?).

The beauty of this partitioning is that now $$\exp\left(-\frac{1}{2\sigma_1^2} \sum_{i=1}^{n_1} (x_i - \mu)^2\right) = \exp\left(-\frac{n_1(\bar x - \mu)^2}{2\sigma_1^2} \right)\exp\left(-\frac{1}{2\sigma_1^2} \sum_{i=1}^{n_1} (x_i - \bar x)^2\right),$$ and the second $\exp$ factor, having no $\mu$, is constant with respect to $\mu$ and can be eliminated.
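The sum-of-squares partition $\sum_i (x_i - \mu)^2 = n(\bar x - \mu)^2 + \sum_i (x_i - \bar x)^2$ is a pure algebraic identity, so it can be sanity-checked numerically with arbitrary made-up data and an arbitrary $\mu$:

```python
import random

# Numerical check of the identity
#   sum (x_i - mu)^2 == n*(xbar - mu)^2 + sum (x_i - xbar)^2
# for arbitrary (made-up) data and an arbitrary mu.
random.seed(1)
x = [random.gauss(5.0, 2.0) for _ in range(12)]
mu = 3.7
n = len(x)
xbar = sum(x) / n

lhs = sum((xi - mu) ** 2 for xi in x)
rhs = n * (xbar - mu) ** 2 + sum((xi - xbar) ** 2 for xi in x)
print(lhs, rhs)  # the two sides agree up to floating-point rounding
```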
Handling the sample from $Y$ similarly, we get a greatly simplified likelihood: $$\mathcal L \propto \exp\left( - \frac{n_1(\bar x - \mu)^2}{2\sigma_1^2} - \frac{n_2(\bar y - \mu)^2}{2\sigma_2^2}\right).$$ The log-likelihood is then $$\ell(\mu \mid \boldsymbol x, \boldsymbol y, \sigma_1, \sigma_2) = -\frac{n_1(\bar x - \mu)^2}{2\sigma_1^2} - \frac{n_2(\bar y - \mu)^2}{2\sigma_2^2}.$$ This, being a quadratic function in $\mu$, is easily maximized with respect to $\mu$ using the basic techniques of calculus. I leave the remainder of this computation to you as an exercise.
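As a sanity check on that calculus exercise (using made-up summary statistics, since no data are given in the question), one can maximize the quadratic log-likelihood numerically over a fine grid and compare against the precision-weighted combination of the two sample means that the first-order condition $\ell'(\mu) = 0$ produces:

```python
# Numerically maximize the simplified log-likelihood
#   l(mu) = -n1*(xbar - mu)^2 / (2*sigma1^2) - n2*(ybar - mu)^2 / (2*sigma2^2)
# using made-up summary statistics for illustration.
n1, sigma1, xbar = 10, 2.0, 4.8
n2, sigma2, ybar = 25, 1.5, 5.1

def loglik(mu):
    return (-n1 * (xbar - mu) ** 2 / (2 * sigma1 ** 2)
            - n2 * (ybar - mu) ** 2 / (2 * sigma2 ** 2))

# Grid search over mu in [4.0, 6.0] with step 1e-4.
grid = [4.0 + k * 1e-4 for k in range(20001)]
mu_grid = max(grid, key=loglik)

# Closed form from setting l'(mu) = 0: a precision-weighted mean.
w1, w2 = n1 / sigma1 ** 2, n2 / sigma2 ** 2
mu_hat = (w1 * xbar + w2 * ybar) / (w1 + w2)

print(mu_grid, mu_hat)  # the grid maximizer matches the closed form
```

The grid maximizer agrees with $\hat\mu = \left(\frac{n_1 \bar x}{\sigma_1^2} + \frac{n_2 \bar y}{\sigma_2^2}\right)\big/\left(\frac{n_1}{\sigma_1^2} + \frac{n_2}{\sigma_2^2}\right)$ to within the grid spacing, which is exactly the precision-weighting the intuition above called for.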


One final remark: It is important to note that the likelihood we computed above does presume that the variances are known, because we dropped a number of factors from the likelihood that are functions of $\sigma_1$ and $\sigma_2$. If one wanted a joint maximum likelihood estimator for $\mu$, $\sigma_1$, and $\sigma_2$, those factors could not be dropped, since the goal would then be to maximize $\mathcal L$ with respect to all three parameters simultaneously. This is a much more complicated computation, and although I have not tried it, I strongly suspect a closed-form solution is not possible in that case.