Scaling a covariance matrix


I am looking at a paper where a covariance matrix is scaled by another matrix whose diagonal entries hold the scaling weights. The formula for the scaled covariance matrix is written as:

$$ S = W \Sigma W^T $$

where $\Sigma$ is the original covariance matrix and $W$ is the weight matrix. I can verify that this operation produces a valid covariance matrix, but it is not intuitive to me. Is there an intuitive, geometric explanation of why we multiply by $W$ on the left and $W^T$ on the right, rather than just taking $W \Sigma$ as the scaled covariance matrix?

Best answer:

$\newcommand{\var}{\operatorname{var}}\newcommand{\cov}{\operatorname{cov}}\newcommand{\E}{\operatorname{E}}$Suppose $X$ is a random vector taking values in $\mathbb R^n=\mathbb R^{n\times1}$ and $\mu=\E (X)$. Feller defined its "variance" to be the $n\times n$ matrix $$ \Sigma = \var(X) = \E((X-\mu)(X-\mu)^T). $$ Some call this the "covariance matrix" or just the "covariance" because its entries are the covariances between scalar components.
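As a quick numerical sanity check of this definition, here is a NumPy sketch (the mixing matrix and sample size are illustrative, not from any paper) that forms the sample version of $\E((X-\mu)(X-\mu)^T)$ and compares it with NumPy's built-in estimator:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw samples of a 3-dimensional random vector X with correlated components.
mix = np.array([[1.0, 0.0, 0.0],
                [0.5, 1.0, 0.0],
                [0.2, 0.3, 1.0]])
X = rng.normal(size=(10_000, 3)) @ mix.T

mu = X.mean(axis=0)
centered = X - mu
# Sample analogue of E[(X - mu)(X - mu)^T].
Sigma = centered.T @ centered / len(X)

# Agrees with numpy's estimator (bias=True uses the same 1/n normalization).
print(np.allclose(Sigma, np.cov(X.T, bias=True)))
```

The result is an $n \times n$ symmetric matrix whose $(i,j)$ entry is the sample covariance between components $i$ and $j$.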

Notice that if $A$ is a $k\times n$ matrix then the random vector $AX$ takes values in $\mathbb R^k=\mathbb R^{k\times 1}$ and its variance must be the $k\times k$ matrix $$ \var(AX) = A\Sigma A^T. $$ This follows because $AX - \E(AX) = A(X-\mu)$, so $\var(AX) = \E\big(A(X-\mu)(X-\mu)^T A^T\big) = A\,\E((X-\mu)(X-\mu)^T)\,A^T$. It generalizes the $1\times 1$ case, in which one multiplies by the square of the scale factor.
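One can verify $\var(AX) = A\Sigma A^T$ empirically; in this sketch the particular $\Sigma$ and $A$ are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

Sigma = np.array([[2.0, 0.6, 0.3],
                  [0.6, 1.0, 0.2],
                  [0.3, 0.2, 0.5]])
A = rng.normal(size=(2, 3))  # a k x n matrix with k=2, n=3

# Generate samples of X with variance Sigma via a Cholesky factor.
L = np.linalg.cholesky(Sigma)
X = rng.normal(size=(100_000, 3)) @ L.T

# Empirical variance of AX versus the closed form A @ Sigma @ A.T.
AX = X @ A.T
empirical = np.cov(AX.T, bias=True)
print(np.allclose(empirical, A @ Sigma @ A.T, atol=0.1))
```

The empirical $k \times k$ covariance of the transformed samples matches $A\Sigma A^T$ up to sampling noise.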

And how would one hope to show that $A\Sigma$ is symmetric and non-negative definite, if that were thought to be the variance?
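Indeed it generally isn't: a small NumPy example (with an arbitrary $\Sigma$ and diagonal $W$ chosen here for illustration) shows that the one-sided product $W\Sigma$ fails to be symmetric, while $W\Sigma W^T$ is symmetric and non-negative definite:

```python
import numpy as np

Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
W = np.diag([3.0, 0.5])  # diagonal weight matrix

# Two-sided scaling: a valid covariance matrix.
S = W @ Sigma @ W.T
print(np.allclose(S, S.T))                 # symmetric
print(np.all(np.linalg.eigvalsh(S) >= 0))  # non-negative definite

# One-sided scaling: not even symmetric, so not a covariance matrix.
print(np.allclose(W @ Sigma, (W @ Sigma).T))
```

The first two checks print `True`; the last prints `False`, since $W\Sigma$ has $(W\Sigma)_{12} = w_1\sigma_{12}$ but $(W\Sigma)_{21} = w_2\sigma_{21}$, and these differ whenever $w_1 \ne w_2$.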