In ridge regression, given the SVD of the matrix $X$, how does one prove the following identity?


Given the SVD of $X$, where $U$ and $V$ are orthogonal: $$X = U\Sigma V^T$$

In typical ridge regression, we have

$$\hat{y}_{\text{ridge}} = X(X^TX + \lambda I)^{-1}X^Ty = U\Sigma V^T(V\Sigma^2 V^T + \lambda I)^{-1}V\Sigma U^Ty.$$

Now my question is, how to prove that,

$$U\Sigma V^T(V\Sigma^2 V^T + \lambda I)^{-1}V\Sigma U^Ty= U\Sigma (\Sigma^2 + \lambda I)^{-1}\Sigma U^Ty$$

Without the $\lambda I$ term, it is easy to see that

$$U\Sigma V^T(V\Sigma^2 V^T)^{-1}V\Sigma U^Ty = UU^Ty$$

but with that additional term, I don't really understand how to get the result shown above.
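As a quick numerical sanity check of the substitution step above (a sketch assuming NumPy, with $X$ tall and of full column rank so that the thin SVD gives a square orthogonal $V$):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, lam = 8, 4, 0.5
X = rng.standard_normal((m, n))   # tall, full column rank almost surely
y = rng.standard_normal(m)

# Thin SVD: U is m x n, Sigma is n x n diagonal, V is n x n orthogonal
U, s, Vt = np.linalg.svd(X, full_matrices=False)
Sigma, V = np.diag(s), Vt.T

# Substitution used in the question: X^T X = V Sigma^2 V^T
assert np.allclose(X.T @ X, V @ Sigma**2 @ Vt)

# Direct ridge prediction vs. the unsimplified SVD expression
yhat_direct = X @ np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
yhat_svd = U @ Sigma @ Vt @ np.linalg.solve(
    V @ Sigma**2 @ Vt + lam * np.eye(n), V @ Sigma @ U.T @ y)
print(np.allclose(yhat_direct, yhat_svd))  # True
```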

Best answer:

Note that $VV^T = I$, since $V$ is a square orthogonal matrix. Hence

\begin{align}U\Sigma V^T(V\Sigma^2 V^T + \lambda I)^{-1}V\Sigma U^Ty&=U\Sigma V^T(V\Sigma^2 V^T + \lambda VV^T)^{-1}V\Sigma U^Ty \\&= U\Sigma V^T\left(V(\Sigma^2 + \lambda I)V^T\right)^{-1}V\Sigma U^Ty \\&= U\Sigma V^TV(\Sigma^2 + \lambda I)^{-1}V^T V\Sigma U^Ty \\&= U\Sigma (\Sigma^2 + \lambda I)^{-1}\Sigma U^Ty \end{align}
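The final identity can be verified numerically; the right-hand side amounts to shrinking each coordinate of $y$ in the basis $U$ by the factor $\sigma_i^2/(\sigma_i^2+\lambda)$ (a sketch assuming NumPy and a full-column-rank $X$, so that $V$ is square orthogonal):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, lam = 6, 3, 2.0
X = rng.standard_normal((m, n))
y = rng.standard_normal(m)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T
assert np.allclose(V @ V.T, np.eye(n))   # V V^T = I: V is square orthogonal here

# Left-hand side, before simplification
lhs = U @ np.diag(s) @ Vt @ np.linalg.solve(
    V @ np.diag(s**2) @ Vt + lam * np.eye(n), V @ np.diag(s) @ U.T @ y)
# Right-hand side: U Sigma (Sigma^2 + lam I)^{-1} Sigma U^T y, a diagonal shrinkage
rhs = U @ ((s**2 / (s**2 + lam)) * (U.T @ y))
print(np.allclose(lhs, rhs))  # True
```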

Edit:

Suppose $X$ has rank $r$, with full SVD $$X=U_r\Sigma_rV_r^T=\begin{bmatrix} U_r & U_n \end{bmatrix}\begin{bmatrix} \Sigma_r & 0 \\ 0 & 0\end{bmatrix}\begin{bmatrix} V_r^T \\ V_n^T \end{bmatrix}.$$ Then

since $V=\begin{bmatrix} V_r & V_n \end{bmatrix}$ is orthogonal, the $V$ factors again cancel, and

\begin{align} X(X^TX+\lambda I)^{-1}X^T &=\begin{bmatrix} U_r & U_n \end{bmatrix}\begin{bmatrix} \Sigma_r & 0 \\ 0 & 0\end{bmatrix}\begin{bmatrix} \Sigma_r^2+\lambda I_{r} & 0 \\ 0 & \lambda I_{n-r}\end{bmatrix}^{-1}\begin{bmatrix} \Sigma_r & 0 \\ 0 & 0\end{bmatrix}\begin{bmatrix} U_r^T \\ U_n^T \end{bmatrix} \\ &=\begin{bmatrix} U_r\Sigma_r & 0 \end{bmatrix}\begin{bmatrix} (\Sigma_r^2+\lambda I_{r})^{-1} & 0 \\ 0 & \lambda^{-1} I_{n-r}\end{bmatrix}\begin{bmatrix} \Sigma_rU_r^T \\ 0 \end{bmatrix} \\ &=U_r\Sigma_r(\Sigma_r^2+\lambda I_r)^{-1}\Sigma_rU_r^T \end{align}
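The block formula can be checked on a rank-deficient example; only the $r$ nonzero singular values contribute to the fit (a sketch assuming NumPy; the rank $r$, dimensions, and $\lambda$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, lam = 5, 2, 0.3
# Rank-deficient X: n x n with rank r
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
y = rng.standard_normal(n)

U, s, _ = np.linalg.svd(X)
Ur, sr = U[:, :r], s[:r]          # compact SVD factors (rank r)

# Full formula with the complete (square) X
yhat_full = X @ np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
# Compact form: U_r Sigma_r (Sigma_r^2 + lam I_r)^{-1} Sigma_r U_r^T y
yhat_compact = Ur @ ((sr**2 / (sr**2 + lam)) * (Ur.T @ y))

print(np.allclose(yhat_full, yhat_compact))  # True
```

The zero singular values contribute $0 \cdot \lambda^{-1} \cdot 0 = 0$, which is why the $\lambda^{-1}I_{n-r}$ block drops out of the final expression.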