Lagrange Method With Random Variables


I am reading this Wikipedia article (https://en.wikipedia.org/wiki/Inverse-variance_weighting), where the Lagrange Method is used with Random Variables to calculate the optimal weights of a weighted mean:

Consider a generic weighted sum $Y=\sum_i w_i X_i$, where the weights $w_i$ are normalized such that $\sum_i w_i = 1$. If the $X_i$ are all independent, the variance of $Y$ is given by $\text{Var}(Y) = \sum_i w_i^2 \sigma_i^2$

For optimality, we wish to minimise $Var(Y)$ which can be done by equating the gradient with respect to the weights of $Var(Y)$ to zero, while maintaining the constraint that $\sum_i w_i = 1$. Using a Lagrange multiplier $w_0$ to enforce the constraint, we express the variance:

$$Var(Y) = \sum_i w_i^2 \sigma_i^2 - w_0(\sum_i w_i - 1)$$

For $k > 0$, $0 = \frac{\partial}{\partial w_k} Var(Y) = 2w_k\sigma_k^2 - w_0$ which implies that $w_k = \frac{w_0/2}{\sigma_k^2}$

The main takeaway here is that $w_k \propto 1/\sigma_k^2$

Since $\sum_i w_i = 1$, substituting $w_i = \frac{w_0/2}{\sigma_i^2}$ into the constraint gives $$\frac{2}{w_0} = \sum_i \frac{1}{\sigma_i^2} := \frac{1}{\sigma_0^2}$$

The individual normalised weights are $w_k = \frac{1}{\sigma_k^2}\left( \sum_i \frac{1}{\sigma_i^2} \right)^{-1}$
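The closed-form weights are easy to check numerically. Below is a minimal sketch in plain NumPy, with made-up example variances, confirming that the weights sum to one and put the most weight on the lowest-variance term:

```python
import numpy as np

# Hypothetical example: three independent measurements with these variances
sigma2 = np.array([1.0, 4.0, 9.0])

# Normalised inverse-variance weights: w_k = (1/sigma_k^2) / sum_i (1/sigma_i^2)
w = (1.0 / sigma2) / np.sum(1.0 / sigma2)

print(w)         # most weight on the lowest-variance measurement
print(w.sum())   # the constraint sum_i w_i = 1 holds by construction
```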

It is easy to see that this extremum solution corresponds to the minimum from the second partial derivative test by noting that the variance is a quadratic function of the weights.
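Concretely: because $\text{Var}(Y)=\sum_i w_i^2\sigma_i^2$ is quadratic in the weights (and the constraint term is linear), the Hessian is the constant diagonal matrix $\mathrm{diag}(2\sigma_i^2)$. A two-line check with hypothetical variances confirms it is positive definite:

```python
import numpy as np

sigma2 = np.array([1.0, 4.0, 9.0])  # hypothetical variances

# Var(Y) = sum_i w_i^2 sigma_i^2 is quadratic in w, so its Hessian is constant:
# d^2 Var(Y) / (dw_j dw_k) = 2 sigma_k^2 * delta_jk  (the constraint term is linear)
H = np.diag(2.0 * sigma2)

# All-positive eigenvalues => positive definite => the extremum is a minimum
print(np.linalg.eigvalsh(H))
```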

The minimum variance of the estimator is then given by:

$$Var(Y) = \sum_i \frac{\sigma_0^4}{\sigma_i^4}\sigma_i^2 = \sigma_0^4\sum_i \frac{1}{\sigma_i^2} = \sigma_0^4\frac{1}{\sigma_0^2} = \sigma_0^2 = \frac{1}{\sum_i 1/\sigma_i^2}$$
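As a sanity check on that final formula, here is a quick Monte Carlo sketch (assumed common mean, made-up variances): the empirical variance of the inverse-variance weighted mean lands near $1/\sum_i 1/\sigma_i^2$ and beats naive equal weighting:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = np.array([1.0, 4.0, 9.0])  # made-up variances
mu = 5.0                            # assumed common mean
n = 200_000

# Independent draws X_i ~ N(mu, sigma_i^2), one column per variable
X = rng.normal(mu, np.sqrt(sigma2), size=(n, 3))

w_opt = (1.0 / sigma2) / np.sum(1.0 / sigma2)  # inverse-variance weights
w_eq = np.full(3, 1.0 / 3.0)                   # naive equal weights

print((X @ w_opt).var())            # close to sigma_0^2 = 1 / sum_i (1/sigma_i^2)
print(1.0 / np.sum(1.0 / sigma2))   # the predicted minimum variance
print((X @ w_eq).var())             # strictly larger: (1/9) * sum_i sigma_i^2
```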

My Question: Supposedly, for the above derivation to be valid, all $X_i \sim N(\mu, \sigma_i^2)$. However, I do not see or understand why this requirement is needed. Nowhere in the above derivation do we seem to invoke the Normal Distribution or a common mean $\mu$ for all $X_i$.

Can someone please explain why this requirement is needed? Or is this requirement not needed?

Thanks!

  • Note 1: I think this requirement might not be needed for population-level parameters (e.g. $\mu$ and $\sigma$), but it might be needed for sample-level parameters (e.g. $\bar{x}$ and $s$)?

  • Note 2: However, if we only have $\bar{x}$ and $s$, I have a feeling we might not be able to use the Lagrange Method, and the optimal weights might come out as something different?
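On Note 1, a quick numerical probe (entirely my own construction, not from the article) is consistent with the population-level derivation never using normality: with independent but deliberately non-normal $X_i$ sharing a common mean, the inverse-variance weighted mean still attains the predicted variance $1/\sum_i 1/\sigma_i^2$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
mu = 2.0  # common mean, chosen arbitrarily

# Three independent, deliberately non-normal variables, all with mean mu:
X1 = rng.exponential(2.0, n)              # Exp(scale=2): mean 2, variance 4
X2 = rng.uniform(mu - 3.0, mu + 3.0, n)   # Uniform:      mean 2, variance 3
X3 = rng.laplace(mu, 1.0, n)              # Laplace:      mean 2, variance 2
sigma2 = np.array([4.0, 3.0, 2.0])

w = (1.0 / sigma2) / np.sum(1.0 / sigma2)  # inverse-variance weights
Y = w[0] * X1 + w[1] * X2 + w[2] * X3

print(Y.var())                      # matches 1 / sum_i (1/sigma_i^2) ...
print(1.0 / np.sum(1.0 / sigma2))   # ... despite no normality anywhere
```

This is only an illustration for one set of distributions, of course, not a proof, and it does not settle what happens when only the sample estimates $\bar{x}$ and $s$ are available.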