Covariance matrix sandwiched between a matrix (e.g. $X$) and its transpose ($X'$)


I probably just need more experience in the field of machine learning (and linear algebra in general), but I keep encountering this sort of pattern in some of the equations I have been reviewing lately. The pattern here is a covariance matrix $\Sigma$ sandwiched in between a single matrix and that matrix's transpose. e.g.,

$X\Sigma X'$

I am wondering if anyone can elucidate for me why this pattern appears in numerous applications or provide any intuition behind the result of this computation. In the exact equation I am considering, $X$ is a "transfer matrix function", and the covariance describes covariance between variables of the transfer matrix function. For this application's purpose, the covariance matrix is "diagonalized" and (typically small) values on the off-diagonal are set to zero. More details on the exact application I am studying can be found here:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3971884/


There are 2 best solutions below

Best answer

It is a general fact that if $Z$ is a random vector with mean $\mu$ and covariance $\Sigma$, and $X$ is a constant matrix with as many columns as $Z$ has entries, then $\mbox{cov}(XZ)=X \Sigma X^{T}$ (your $X'$ is the transpose $X^{T}$ here). The expression you asked about is the covariance of a linear transformation of a random vector.

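To show this, note that $E[XZ]=X\mu$ by linearity of expectation, so

$\mbox{cov}(XZ)=E[(XZ-X\mu)(XZ-X\mu)^{T}]=E[X(Z-\mu)(Z-\mu)^{T}X^{T}]$

Because $X$ is constant, it can be pulled out of the expectation on both sides:

$\mbox{cov}(XZ)=XE[(Z-\mu)(Z-\mu)^{T}]X^{T}=X \mbox{cov}(Z) X^{T}=X \Sigma X^{T}$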
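If it helps to see the identity numerically, here is a minimal NumPy sketch. The particular `mu`, `Sigma`, `X`, the Gaussian distribution and the sample size are all made up for illustration; the identity itself does not depend on any of them. It checks two things: the sample covariance of the transformed data equals $X S X^{T}$ up to floating-point rounding (where $S$ is the sample covariance of the original data), and both are close to $X \Sigma X^{T}$ when the sample is large.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-dimensional random vector Z with a known mean and covariance.
mu = np.array([1.0, -2.0, 0.5])
A = rng.normal(size=(3, 3))
Sigma = A @ A.T + 0.1 * np.eye(3)   # symmetric positive definite by construction

# Hypothetical constant 2x3 matrix X (maps 3-vectors to 2-vectors).
X = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])

# Draw samples of Z; each row of Z is one sample.
n = 100_000
Z = rng.multivariate_normal(mu, Sigma, size=n)

# Apply the linear map to every sample: row i of Y is X @ Z[i].
Y = Z @ X.T

S_Z = np.cov(Z, rowvar=False)   # 3x3 sample covariance of Z
S_Y = np.cov(Y, rowvar=False)   # 2x2 sample covariance of XZ

# The identity holds for sample covariances up to rounding error.
exact_gap = np.max(np.abs(S_Y - X @ S_Z @ X.T))

# Against the population covariance it holds up to sampling error.
mc_gap = np.max(np.abs(S_Y - X @ Sigma @ X.T))

print("gap vs X S_Z X^T:  ", exact_gap)
print("gap vs X Sigma X^T:", mc_gap)
```

The first gap should be at rounding level, while the second shrinks as `n` grows.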

Another answer

$$
\DeclareMathOperator{\E}{\mathbb{E}}
\DeclareMathOperator{\cov}{cov}
\begin{align}
\cov(Z) :&= \E\bigl[ (Z-\E Z)(Z-\E Z)^\top \bigr] \tag*{definition} \\[1ex]
\cov(XZ) &= \E\bigl[ (XZ-\E[XZ])(XZ-\E[XZ])^\top \bigr] \tag*{by definition} \\
&= \E\bigl[ (XZ-X\E Z)(XZ-X\E Z)^\top \bigr] \tag*{linearity of expectation} \\
&= \E\bigl[ X(Z-\E Z)\bigl(X(Z-\E Z)\bigr){}^{\!\top} \bigr] \tag*{factor} \\
&= \E\bigl[ X(Z-\E Z)(Z-\E Z)^\top X^\top \bigr] \tag*{$(ab)^\top = b^\top a^\top$} \\
&= X \E\bigl[ (Z-\E Z)(Z-\E Z)^\top \bigr] X^\top \tag*{linearity of expectation} \\
&= X \cov(Z) X^\top \tag*{by definition}
\end{align}
$$

The only probabilistic fact used is linearity of expectation; here $X$ is a constant (non-random) matrix.
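Since the question mentions that the covariance is made diagonal in its application, it may be worth noting what the sandwich looks like in that special case: with $\Sigma = \operatorname{diag}(\sigma_1^2,\dots,\sigma_n^2)$ and $x_i$ the $i$-th column of $X$, the product reduces to $X\Sigma X^\top = \sum_i \sigma_i^2\, x_i x_i^\top$, a variance-weighted sum of rank-one terms. A small sketch with made-up numbers (the `X` and `variances` below are hypothetical, not taken from the linked paper):

```python
import numpy as np

# Hypothetical constant matrix X and hypothetical diagonal variances.
X = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])
variances = np.array([0.5, 2.0, 1.5])
Sigma_diag = np.diag(variances)

# The full sandwich product X Sigma X^T.
full = X @ Sigma_diag @ X.T

# The same matrix as a variance-weighted sum of outer products of X's columns.
rank_one_sum = sum(v * np.outer(X[:, i], X[:, i])
                   for i, v in enumerate(variances))

# Equivalent shortcut: scale the columns of X by the variances, no diag matrix needed.
scaled = (X * variances) @ X.T

print(full)
print(np.allclose(full, rank_one_sum), np.allclose(full, scaled))
```

Each input variable contributes its own variance through the corresponding column of $X$, and there are no cross terms because the off-diagonal entries of $\Sigma$ are zero.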