I was trying to show the following:
$$(\hat{X}^T \hat{X} + \lambda n I)^{-1}\hat{X}^T \hat{y} = \hat{X}^T (\hat{X} \hat{X}^T + \lambda n I)^{-1} \hat{y}$$
I was told to use the Singular Value Decomposition of $\hat{X} = U \Sigma V^T = \sum^{r}_{i=1} \sigma_i u_i v_i^T$. So I tried:
$$ (\hat{X}^T \hat{X} + \lambda n I)^{-1} \hat{X}^T \hat{y} = (\hat{X}^T \hat{X} + \lambda n I)^{-1} (U \Sigma V^T)^T \hat{y} $$
$$ ((U \Sigma V^T)^T (U \Sigma V^T) + \lambda n I)^{-1} V \Sigma U^T \hat{y} = ((V \Sigma^2 V^T) + \lambda n I)^{-1} V \Sigma U^T \hat{y} $$
However, after that step I got stuck, and it wasn't obvious to me how to proceed. Two things in particular confuse me:
- First, it's not clear to me that the inverse $(\hat{X}^T \hat{X} + \lambda n I)^{-1} = (V \Sigma^2 V^T + \lambda n I)^{-1}$ even exists.
- Second, even if the inverse exists, I'm not aware of any rule for the inverse of a sum of matrices. (There is one for transposes, $(A + B)^T = A^T + B^T$, but I'm not sure about inverses and can't find anything useful.)
Does anyone have any idea how to proceed? Or how I could further use the SVD to show the equality I'm trying to show?
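Let $X$ be an $m\times k$ matrix, so that $X^T X$ is $k \times k$ and $X X^T$ is $m \times m$. Both sides of the following expand to $X^T X X^T + \lambda n X^T$, so you have $$(X^T X + \lambda n I_k) X^T = X^T (X X^T + \lambda n I_m).$$ When $-\lambda n$ is in the spectrum of neither $X^T X$ nor $X X^T$ (e.g. when $\lambda n > 0$, since both matrices are positive semidefinite), both bracketed matrices are invertible. Multiplying on the left by $(X^T X + \lambda n I_k)^{-1}$ and on the right by $(X X^T + \lambda n I_m)^{-1}$ gives: $$ X^T(X X^T + \lambda n I_m)^{-1} = (X^T X + \lambda n I_k)^{-1}X^T .$$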
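If a numerical sanity check helps, the identity above can be tested on random data. This is only a minimal sketch using NumPy: the function names `ridge_primal` and `ridge_dual`, the matrix sizes, and the constant `c` (standing in for $\lambda n$) are my own choices for illustration, not anything from the question.

```python
import numpy as np

def ridge_primal(X, y, c):
    """Compute (X^T X + c I)^{-1} X^T y; the identity matches the size of X^T X."""
    A = X.T @ X
    return np.linalg.solve(A + c * np.eye(A.shape[0]), X.T @ y)

def ridge_dual(X, y, c):
    """Compute X^T (X X^T + c I)^{-1} y; the identity matches the size of X X^T."""
    B = X @ X.T
    return X.T @ np.linalg.solve(B + c * np.eye(B.shape[0]), y)

# Hypothetical test data: a random 5 x 3 matrix and a random target vector.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
y = rng.standard_normal(5)
c = 0.5  # stands in for lambda * n; must be > 0 for the argument above

w_primal = ridge_primal(X, y, c)
w_dual = ridge_dual(X, y, c)

# The two vectors should agree up to floating-point error.
print(np.allclose(w_primal, w_dual))
```

Both functions call `np.linalg.solve` rather than forming an explicit inverse, which is usually the better-conditioned choice. Of course, a check like this only illustrates the identity on one example; the algebraic argument is what proves it.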