Proof of a matrix identity encountered in kernelized ridge regression


In deriving kernelized ridge regression I've encountered a matrix identity that I can't quite prove:

$$X^T(XX^T+\lambda I)^{-1}=(X^TX+\lambda I)^{-1}X^T$$

where $\lambda$ is a scalar constant and $X$ is assumed to be invertible.

I can see how this would hold if neither bracketed term were inverted, but after trying to apply identities for inverses of sums (namely, the Woodbury matrix identity), the whole thing just gets hopelessly complicated.

Can someone outline how this equality comes about?

The equation can be found on top of page 2 of this paper.

Accepted answer

If you multiply by $(XX^T+\lambda I) $ on the right and by $(X^TX+\lambda I) $ on the left, your identity is $$ (X^TX+\lambda I)X^T=X^T (XX^T+\lambda I), $$ which is trivial to verify.
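Spelling out the verification (my addition, not part of the original answer): expanding both sides of the last equation gives the same expression, since

$$(X^TX+\lambda I)X^T = X^TXX^T+\lambda X^T = X^T(XX^T+\lambda I).$$

This is sometimes called the "push-through" identity.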

What you need is not that $X$ is invertible, but that $\lambda>0$; that guarantees that $X^TX+\lambda I$ and $XX^T+\lambda I$ are positive definite, and hence invertible. In particular, $X$ need not even be square.
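As a quick numerical sanity check (a sketch of mine, not from the original answer), one can verify the identity with NumPy for a random non-square $X$ and $\lambda > 0$:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))  # X is not square, hence not invertible
lam = 0.1                        # any lambda > 0 works

n, d = X.shape
# Left-hand side: X^T (X X^T + lambda I_n)^{-1}
lhs = X.T @ np.linalg.inv(X @ X.T + lam * np.eye(n))
# Right-hand side: (X^T X + lambda I_d)^{-1} X^T
rhs = np.linalg.inv(X.T @ X + lam * np.eye(d)) @ X.T

print(np.allclose(lhs, rhs))  # True
```

Note the two identity matrices have different sizes ($n \times n$ versus $d \times d$), which is exactly why the identity is useful in kernelized ridge regression: one can invert whichever Gram matrix is smaller.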