Does This Equation Have a Closed Form Solution?


In a previous question I asked (Lagrange Method With Random Variables), someone suggested in the comments that Maximum Likelihood Estimation can be used to arrive at the same solution as the Lagrange method for this optimization problem (https://en.wikipedia.org/wiki/Inverse-variance_weighting).

Here is the original problem: Consider a generic weighted sum $Y=\sum_i w_i X_i$, where the weights $w_i$ are normalized such that $\sum_i w_i = 1$. If the $X_i$ are all independent, the variance of $Y$ is given by $\text{Var}(Y) = \sum_i w_i^2 \sigma_i^2$.
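As a quick sanity check of this variance formula (my own sketch, not part of the original question, with arbitrary illustrative weights and variances), a short Monte Carlo simulation can compare the empirical variance of $Y$ against $\sum_i w_i^2 \sigma_i^2$:

```python
import numpy as np

# Monte Carlo check that Var(Y) = sum_i w_i^2 sigma_i^2 for
# independent X_i and weights summing to 1 (illustrative numbers).
rng = np.random.default_rng(0)
mus = np.array([3.0, 3.0, 3.0])      # equal means, chosen for illustration
sigmas = np.array([1.0, 2.0, 0.5])
w = np.array([0.2, 0.3, 0.5])        # arbitrary weights with sum 1

X = rng.normal(mus, sigmas, size=(1_000_000, 3))  # independent draws
Y = X @ w

print(Y.var())                    # empirical variance of Y
print(np.sum(w**2 * sigmas**2))   # theoretical value
```

With a million draws the two printed values agree to a few decimal places.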

Now, here is my attempt to arrive at the same solution using Maximum Likelihood Estimation (as per the suggestion in the comments):

Suppose $Y = \sum_{i} w_iX_i$, where $w_i$ are the weights, $X_i$ are independent random variables, and $\sum_{i} w_i = 1$. We want to find the optimal weights that minimize the variance of $Y$.

If each $X_i$ follows a normal distribution with mean $\mu_i$ and variance $\sigma_i^2$, then $Y$ is itself normally distributed: $$Y \sim N\left(\sum_{i} w_i\mu_i, \sum_{i} w_i^2\sigma_i^2\right)$$

Now, for a single observation $y$ of $Y$, the logarithm of the likelihood function can be written as:$$\log(L) = \log\left(\frac{1}{\sqrt{2\pi\sum_{j} w_j^2\sigma_j^2}} \exp\left(-\frac{1}{2\sum_{j} w_j^2\sigma_j^2}\left(y - \sum_{i} w_i\mu_i\right)^2\right)\right)$$

$$= -\frac{1}{2}\log(2\pi) - \frac{1}{2}\log\left(\sum_{j} w_j^2\sigma_j^2\right) - \frac{1}{2\sum_{j} w_j^2\sigma_j^2}(y - \sum_{i} w_i\mu_i)^2$$

To find the maximum likelihood estimate, we differentiate (using chain rule) the log-likelihood with respect to $w_i$ and set it to zero:

$$\frac{\partial}{\partial w_{i}} \log (L) = -\frac{w_i\sigma_i^2}{\sum_j w_j^2\sigma_j^2} + \frac{(\sum_j w_j^2\sigma_j^2)(y - \sum_{k} w_{k} \mu_{k})\mu_i + (y - \sum_{k} w_{k} \mu_{k})^2 w_i\sigma_i^2}{(\sum_j w_j^2\sigma_j^2)^2} = 0$$

I am not sure whether the above system of likelihood equations has a closed form solution for $w_i$. Perhaps it does if all the $\mu_i$ are equal to each other?
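One way to sanity-check the score equation above (my own sketch, with arbitrary illustrative numbers) is to compare an analytic gradient of this log-likelihood against central finite differences:

```python
import numpy as np

# Finite-difference check of d(logL)/dw_i for
# logL = -0.5*log(2*pi) - 0.5*log(V) - (y - m)^2 / (2V),
# where V = sum_j w_j^2 sigma_j^2 and m = sum_j w_j mu_j.
# All numerical values below are arbitrary illustrative choices.
y = 2.7
mu = np.array([2.0, 3.0, 2.5])
sig2 = np.array([1.0, 4.0, 0.25])
w = np.array([0.2, 0.3, 0.5])

def loglik(w):
    V = np.sum(w**2 * sig2)
    m = np.sum(w * mu)
    return -0.5 * np.log(2 * np.pi) - 0.5 * np.log(V) - (y - m)**2 / (2 * V)

def score(w):
    V = np.sum(w**2 * sig2)
    m = np.sum(w * mu)
    # analytic gradient as I derive it: the first term comes from the
    # log-variance, the other two from differentiating the exponent
    return (-w * sig2 / V
            + mu * (y - m) / V
            + w * sig2 * (y - m)**2 / V**2)

eps = 1e-6
fd = np.array([(loglik(w + eps * np.eye(3)[i]) - loglik(w - eps * np.eye(3)[i]))
               / (2 * eps) for i in range(3)])
print(np.max(np.abs(fd - score(w))))  # close to zero if the algebra is right
```

This only validates the differentiation, not whether the resulting system can be solved in closed form.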

My Question: Can someone please tell me if I am doing this correctly? Can Maximum Likelihood Estimation really be used to arrive at the same optimal solutions for $w_i$ as compared to the Lagrange Method?

Thanks!

  • Note:

  • If $Y=\sum_i w_i X_i$, then I don't think that variance of $Y$ is given by $\text{Var}(Y) = \sum_i w_i^2 \sigma_i^2$ unless each $X_i$ has the same mean $\mu$.

  • This suggests that the "objective function" (i.e. $\text{Var}(Y)$) being optimized would actually be a completely different function, with a completely different optimal solution for $w_i$.

  • I think I actually asked a question about this same point over here: Simplifying the Formulas for Weighted Means.


There are 2 best solutions below


From looking at the Wikipedia article you mention, I believe the maximum likelihood estimate they are referring to is as follows. Assume you have a set of independent random variables, $X_{1},X_{2},...,X_{n}$, which have a common unknown mean, $\mu$, and known but potentially unequal variances $\sigma_{i}^{2}$. If it is further assumed that $X_{i}\sim N\left(\mu,\sigma_{i}^{2}\right)$, then we can work out the maximum likelihood estimator of $\mu$ conditional on a realized set of data $\mathbf{x}=\left(x_{1},...,x_{n}\right)$. The likelihood is

\begin{align} L=\prod_{i=1}^{n}\left(\frac{1}{\sigma_{i}\sqrt{2\pi}}e^{-0.5\left(\frac{x_{i}-\mu}{\sigma_{i}}\right)^{2}}\right) \end{align}

with the log likelihood given by

\begin{align} \mathrm{log}(L)=\sum_{i=1}^{n}\left(-\frac{1}{2}\left(\frac{x_{i}-\mu}{\sigma_{i}}\right)^{2}-\mathrm{log}(\sigma_{i}\sqrt{2\pi})\right) \end{align}

Maximising with respect to $\mu$ and setting the derivative to zero gives

\begin{align} \sum_{i=1}^{n}\left(\frac{x_{i}-\mu}{\sigma_{i}^{2}}\right)=0 \end{align}

So the maximum likelihood estimator of $\mu$ is given by

\begin{align} \hat{\mu}=\left(\sum_{i=1}^{n}\frac{x_{i}}{\sigma_{i}^{2}}\right)/\left(\sum_{i=1}^{n}\frac{1}{\sigma_{i}^{2}}\right) \end{align}

This is the same inverse-variance weighting that was arrived at under the Lagrange method.


If $X_1,...,X_n$ are independent with a common mean, i.e. $E(X_1) = E(X_2) = ... = E(X_n) = \mu$, then minimizing $\text{Var}(Y) = \text{Var}(\sum_i w_i X_i)$ subject to $\sum_i w_i = 1$ is the same as constructing an unbiased estimator of $\mu$ from $X_1,...,X_n$ with minimum variance. Such an estimator is called a Minimum Variance Unbiased Estimator (MVUE); note that you are additionally restricting to linear estimators. In general, the linear MVUE need not coincide with the ML estimator, but asymptotically, as $n \rightarrow \infty$, the ML estimator attains the same optimal variance as the MVUE. For normally distributed $X_i$, it turns out that the two estimators coincide even for finite $n$.