Mean squared error for vectors


I know that when we compare estimators $\hat{b_1}$ and $\hat{b_2}$ of an unknown parameter $\beta$, in classical statistics the estimator $\hat{b_1}$ is said to be "better" than $\hat{b_2}$ if:

$$ MSE(\hat{b_1}) \leq MSE(\hat{b_2}) $$ where MSE is the mean squared error: $$ MSE(\hat{b_1}) = E((\hat{b_1}-\beta)^2 )$$

Now if I had a vector $\boldsymbol{b} = (b_1, b_2, \ldots, b_n)$ of parameters to estimate, how could I compare estimators in terms of the MSE? There is no natural total order on vectors.

I know some people compare the estimators component by component, yet I can't seem to find any references for that approach. Could you help me find some?


In practical applications (engineering), the error (noise) is measured by the modulus of the difference vector, in some cases taken relative to the modulus of the reference vector, i.e. $$ \varepsilon = {{\left| {\Delta {\bf v}} \right|} \over {\left| {{\bf v}_{\,ref} } \right|}} $$


That means that you are considering the error as the distance between the vectors' tips.

In certain situations (when the focus is on energy) it may be of interest to distinguish between the normal (out-of-phase) and the parallel (in-phase) component of the error.
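A minimal sketch of both ideas, using made-up example vectors (the names `v_ref` and `v_meas` are illustrative, not from any particular library):

```python
import numpy as np

# Hypothetical example: a reference vector and a noisy measurement of it.
v_ref = np.array([3.0, 4.0])
v_meas = np.array([3.1, 3.8])

dv = v_meas - v_ref                                # error vector (tip-to-tip difference)
eps = np.linalg.norm(dv) / np.linalg.norm(v_ref)   # relative error |dv| / |v_ref|

# Split the error into its in-phase (parallel to v_ref) and
# out-of-phase (normal to v_ref) components.
u = v_ref / np.linalg.norm(v_ref)                  # unit vector along v_ref
dv_par = (dv @ u) * u                              # in-phase component
dv_perp = dv - dv_par                              # out-of-phase component

print(eps)
print(dv_par, dv_perp)
```

By construction `dv_par + dv_perp` recovers `dv`, and the two components are orthogonal.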


Note that if $\hat \theta(X)$ is an estimator (depending on random data $X$) for the parameter $\theta\in \mathbb{R}^n,$ the MSE is a scalar quantity defined as

$$\begin{align}MSE(\hat\theta,\theta)&\equiv E[\|\hat\theta(X)-\theta\|^2]\\ &=E[(\hat\theta(X)-\theta)'(\hat\theta(X)-\theta)].\\\end{align}$$
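This scalar MSE can be approximated by simulation. A sketch, assuming (purely for illustration) that the estimator is the sample mean of i.i.d. $N(\theta, I)$ observations:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.array([1.0, -2.0, 0.5])   # true parameter (illustrative choice)
n_obs, n_rep = 50, 20_000

# Estimator: the sample mean of n_obs i.i.d. N(theta, I) observations,
# computed independently in each of n_rep replications.
X = rng.normal(theta, 1.0, size=(n_rep, n_obs, theta.size))
theta_hat = X.mean(axis=1)           # one vector estimate per replication

# Monte Carlo approximation of MSE = E ||theta_hat - theta||^2
mse = np.mean(np.sum((theta_hat - theta) ** 2, axis=1))
print(mse)                           # close to n / n_obs = 3/50 = 0.06 here
```

For this unbiased estimator the MSE is just the trace of its covariance, $n/n_{\text{obs}}$, which the simulation reproduces up to Monte Carlo error.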

With some matrix algebra, one can easily prove the identity

$$\begin{align}MSE(\hat\theta,\theta)&=\|Bias(\hat\theta,\theta)\|^2+tr(Var(\hat\theta(X))),\\ Bias(\hat\theta,\theta)&\equiv E[\hat\theta(X)]-\theta. \end{align}$$
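The identity can be checked numerically. A sketch with a deliberately biased estimator (a shrinkage of the sample mean, chosen only to make the bias term nonzero):

```python
import numpy as np

rng = np.random.default_rng(1)
theta = np.array([1.0, -2.0, 0.5])   # true parameter (illustrative choice)
n_obs, n_rep = 50, 20_000

X = rng.normal(theta, 1.0, size=(n_rep, n_obs, theta.size))
theta_hat = 0.9 * X.mean(axis=1)     # biased (shrinkage) estimator

# Left-hand side: Monte Carlo MSE.
mse = np.mean(np.sum((theta_hat - theta) ** 2, axis=1))

# Right-hand side: ||Bias||^2 + tr(Var), both estimated from the replications.
bias = theta_hat.mean(axis=0) - theta
tr_var = np.trace(np.cov(theta_hat, rowvar=False))

print(mse, np.sum(bias ** 2) + tr_var)   # the two sides agree up to Monte Carlo error
```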

So rather than look at a vector of individual MSEs, we typically look at the above metric as the generalization of MSE.


However, the MSE is only one metric to judge an estimator by. One may also be interested in looking at the variance-covariance matrix $Var(\hat\theta(X))$, in which case your question still stands, namely how do we decide which of $V_1\equiv Var(\hat\theta_1(X))$, $V_2\equiv Var(\hat\theta_2(X))$ is "greater" given two estimators $\hat\theta_1(X),\hat\theta_2(X)$?

A common partial order used for this purpose, defined on the set of symmetric positive semidefinite matrices, is the Loewner order: $$V_1\geq V_2\iff V_1-V_2 \text{ is positive semidefinite (p.s.d.)}.$$

Being a partial order, this relation cannot be used to compare any two variance-covariance matrices summoned from the ether, but it is still meaningful. For instance, because p.s.d matrices have nonnegative diagonal entries, one immediate implication of $V_1\geq V_2$ is that the variance of each component of $\hat\theta_1(X)$ is at least as great as the variance of the corresponding component of $\hat\theta_2(X).$
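A minimal sketch of checking the Loewner order numerically, via the smallest eigenvalue of the difference (the matrices here are made up for illustration):

```python
import numpy as np

def loewner_geq(V1, V2, tol=1e-10):
    """True if V1 - V2 is positive semidefinite, i.e. V1 >= V2 in the Loewner order."""
    # For symmetric matrices, p.s.d. is equivalent to all eigenvalues >= 0;
    # eigvalsh exploits the symmetry. tol absorbs floating-point round-off.
    return np.linalg.eigvalsh(V1 - V2).min() >= -tol

V2 = np.array([[1.0, 0.2],
               [0.2, 1.0]])
V1 = V2 + np.diag([0.5, 0.1])   # add a p.s.d. matrix, so V1 >= V2

A = np.diag([1.0, 2.0])
B = np.diag([2.0, 1.0])         # neither A - B nor B - A is p.s.d.

print(loewner_geq(V1, V2))                     # True
print(loewner_geq(A, B), loewner_geq(B, A))    # False False: incomparable
```

The pair `A`, `B` illustrates why the order is only partial: each beats the other in one diagonal entry, so the difference is indefinite in both directions.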