Finding the optimal weights to place on $n$ estimators to create a weighted-average that minimizes the expected squared error


Consider $n$ independent random variables $X_1, X_2, ... X_n$, each of which is an estimator of a criterion $Y$.

$X_1, X_2, ... X_n$ may have different (known) variances, and may have different (known) biases in relation to $Y$.

What weights $w_1, w_2, ... w_n$ should be placed on $X_1, X_2, ... X_n$ to create a weighted-average (i.e., an aggregate estimator) that minimizes the expected squared error?

A reference that deals with this (exact) problem, or better still, a demonstration of the solution, would be much appreciated. I have managed to find the solution for $n = 2$. However, the math for $n > 2$ becomes tedious; I suspect it requires the use of matrices.

1 Answer

This is an interesting problem. Here is my attempt.

Write, $Z = w_1X_1 + \cdots + w_nX_n$.

We would like to solve, \begin{align*} \min_{w_i} \mathbb{E}[(Z-Y)^2] &= \min_{w_i} \mathbb{E}[Z^2] + \mathbb{E}[Y^2] - 2 \mathbb{E}[Z] \mathbb{E}[Y] \\&= \min_{w_i} \mathbb{V}[Z] + \mathbb{V}[Y] + (\mathbb{E}[Z] - \mathbb{E}[Y])^2 \end{align*} where the first equality assumes $Y$ is independent of the $X_i$, so that $\mathbb{E}[ZY] = \mathbb{E}[Z]\mathbb{E}[Y]$.

Taking the derivative with respect to $w_i$ and setting it to zero we have, $$ w_i\mathbb{V}[X_i] + \left(\sum_j w_j \mathbb{E}[X_j] - \mathbb{E}[Y]\right) \mathbb{E}[X_i] = 0 $$ where we have used, $$ \mathbb{E}[Z] = w_1 \mathbb{E}[X_1] + \cdots + w_n \mathbb{E}[X_n] $$ and $$ \mathbb{V}[Z] = w_1^2 \mathbb{V}[X_1] + \cdots + w_n^2 \mathbb{V}[X_n] $$

Note that this is a linear system of equations in $w_i$: $ (\operatorname{diag}(V) + EE^T)w = \mathbb{E}[Y] E $ where $V$ and $E$ are the vectors of variances and means of the $X_i$.

In matrix form, $$ \begin{bmatrix} \mathbb{V}[X_1]+\mathbb{E}[X_1]^2 & \mathbb{E}[X_1]\mathbb{E}[X_2] & \cdots & \mathbb{E}[X_1]\mathbb{E}[X_n] \\ \mathbb{E}[X_2]\mathbb{E}[X_1] & \mathbb{V}[X_2]+ \mathbb{E}[X_2]^2& \cdots & \mathbb{E}[X_2]\mathbb{E}[X_n] \\ \vdots & & \ddots\\ \mathbb{E}[X_n]\mathbb{E}[X_1] & \mathbb{E}[X_n]\mathbb{E}[X_2] & \cdots & \mathbb{V}[X_n]+\mathbb{E}[X_n]^2 \\ \end{bmatrix} \begin{bmatrix} w_1 \\ w_2\\\vdots \\ w_n \end{bmatrix} = \begin{bmatrix} \mathbb{E}[Y]\mathbb{E}[X_1]\\ \mathbb{E}[Y]\mathbb{E}[X_2]\\ \vdots \\ \mathbb{E}[Y]\mathbb{E}[X_n] \end{bmatrix} $$
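The linear system above can be assembled and solved directly. Here is a minimal NumPy sketch; the means, variances, and $\mathbb{E}[Y]$ are made-up illustrative values, not part of the original question.

```python
import numpy as np

# Illustrative values (assumptions, purely for demonstration):
means = np.array([1.0, 1.2, 0.8])      # E[X_i] -- possibly biased estimators of Y
variances = np.array([0.5, 0.2, 1.0])  # V[X_i]
EY = 1.0                               # E[Y]

# Assemble (diag(V) + E E^T) w = E[Y] * E and solve for the weights.
A = np.diag(variances) + np.outer(means, means)
b = EY * means
w = np.linalg.solve(A, b)
```

The solved `w` satisfies the stationarity condition $w_i\mathbb{V}[X_i] + (\sum_j w_j \mathbb{E}[X_j] - \mathbb{E}[Y])\mathbb{E}[X_i] = 0$ derived earlier.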

So if this system has a solution, it gives a stationary point of the minimization problem. Moreover, $\operatorname{diag}(V)+EE^T$ is (up to a factor of $2$) the Hessian of the objective and is positive semidefinite, so the objective is convex and any stationary point is a global minimum.
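This can be checked numerically. The sketch below uses an assumed toy setup (not from the question) in which $Y$ is a constant criterion and each $X_i = Y + \text{bias}_i + \text{noise}_i$ with independent Gaussian noise, and compares the Monte Carlo mean squared error of the solved weights against a naive equal-weight average.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup: Y is a constant criterion, and each X_i is an
# independent, biased, noisy estimator of it.
Y = 1.0
biases = np.array([0.0, 0.2, -0.2])
variances = np.array([0.5, 0.2, 1.0])
means = Y + biases  # E[X_i]

# Optimal weights from the linear system derived above.
A = np.diag(variances) + np.outer(means, means)
w = np.linalg.solve(A, Y * means)

def mse(weights, n_samples=200_000):
    """Monte Carlo estimate of E[(w^T X - Y)^2]."""
    X = means + rng.standard_normal((n_samples, len(means))) * np.sqrt(variances)
    return np.mean((X @ weights - Y) ** 2)
```

Comparing `mse(w)` with `mse(np.full(3, 1/3))` confirms that the solved weights achieve a lower expected squared error than equal weighting in this example.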

In particular, if $\mathbb{V}[X_i] > 0$ for all $i$, then the matrix is positive definite, so a unique solution is guaranteed.
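In that case the matrix is a rank-one update of an invertible diagonal matrix, so (as an additional observation) the Sherman–Morrison formula gives the weights in closed form:
$$ w_i = \frac{\mathbb{E}[Y]\,\mathbb{E}[X_i]/\mathbb{V}[X_i]}{1 + \sum_j \mathbb{E}[X_j]^2/\mathbb{V}[X_j]} $$
so each weight is proportional to $\mathbb{E}[X_i]/\mathbb{V}[X_i]$, with a common normalizing denominator.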