While reading about ridge regression, I encountered the identity
$$(X^TX+\lambda I)^{-1}X^TX = I-\lambda(X^TX+\lambda I)^{-1}$$
What mathematical manipulation has been done here?
Thanks in advance
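It holds whenever $-\lambda$ is not an eigenvalue of $X^TX$, i.e. whenever $X^TX+\lambda I$ is invertible (in particular for every $\lambda>0$, since $X^TX$ is positive semidefinite):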
\begin{align*}
(X^TX+\lambda I)^{-1}X^TX+\lambda(X^TX+\lambda I)^{-1}&=(X^TX+\lambda I)^{-1}X^TX+(X^TX+\lambda I)^{-1}\cdot\lambda I\\
&=(X^TX+\lambda I)^{-1}(X^TX+\lambda I) \\
&=I.
\end{align*}
Then subtracting $\lambda(X^TX+\lambda I)^{-1}$ from both sides, we obtain the result.
As far as finding a "mathematical manipulation" for this goes, I do not think there is one. However, it is similar to the relation which holds for real numbers $x\neq-\lambda$: $$\frac{x}{x+\lambda}=1-\frac{\lambda}{x+\lambda}. $$
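One way to read the scalar relation is as adding and subtracting $\lambda$ in the numerator, and the same rewriting carries over to matrices. A sketch, assuming $X^TX+\lambda I$ is invertible:
\begin{align*}
(X^TX+\lambda I)^{-1}X^TX &=(X^TX+\lambda I)^{-1}\bigl[(X^TX+\lambda I)-\lambda I\bigr]\\
&=I-\lambda(X^TX+\lambda I)^{-1}.
\end{align*}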
@Aweygan Strictly speaking, this is not a new answer, but I want to stress that the question (and its answer) can be made even simpler, because the special form $X^TX$ plays no rôle.
More precisely, let us reformulate the question in the form:
Show that, for an $n \times n$ matrix $A$, we have:
$$(A+\lambda I)^{-1}A = I-\lambda(A+\lambda I)^{-1}$$
Answer: Since $(A+\lambda I)$ is assumed invertible, this relationship is equivalent to the equation obtained by pre-multiplying both sides by $(A+\lambda I)$, i.e.,
$$A=(A+\lambda I)-\lambda I,$$
that is, $A=A$, which is evidently true.
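For readers who like a numerical sanity check, here is a minimal sketch in NumPy. The matrix sizes, the random seed, and the value of `lam` are arbitrary assumptions chosen for illustration; the check covers both the ridge case $X^TX$ and a generic square matrix $A$ (assumed to have $A+\lambda I$ invertible, which holds for almost every random draw).

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
lam = 0.7                        # hypothetical penalty value, lam > 0

# Ridge case: A = X^T X for a hypothetical 50 x 4 design matrix X.
X = rng.standard_normal((50, 4))
A = X.T @ X
I = np.eye(A.shape[0])
inv = np.linalg.inv(A + lam * I)
lhs = inv @ A                    # (X^T X + lam I)^{-1} X^T X
rhs = I - lam * inv              # I - lam (X^T X + lam I)^{-1}
ridge_holds = np.allclose(lhs, rhs)

# Generic case: an arbitrary (non-symmetric) square matrix B.
B = rng.standard_normal((4, 4))
inv_B = np.linalg.inv(B + lam * I)
generic_holds = np.allclose(inv_B @ B, I - lam * inv_B)

print(ridge_holds, generic_holds)
```

Forming the inverse explicitly is done here only to mirror the formula; it is not meant as a recommendation for how to compute ridge estimates in practice.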