I'm learning about Tikhonov regularization:
$$\underset{x\in X}{\arg\inf}\left\{\|Ax-b\|^2+\lambda \|x\|^2\right\}$$
I have read that the solution keeps the residual $\|Ax-b\|^2$ small and is stabilized through the $\lambda \|x\|^2$ term. Can anyone help me understand why that is? I can see that the term prevents overfitting, but I can't quite see how it helps with stability.
Thanks in advance.
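Assuming $\|\cdot\|$ is the Euclidean ($\ell_2$) norm and $\lambda > 0$, the solution for $x$ is \begin{align*} x = (A^T A + \lambda I)^{-1}A^T b. \end{align*} The instability in this solution lies in the inverse. If $A$ has columns that are nearly linearly dependent, then $A^TA$ is "nearly non-invertible": its smallest eigenvalue is close to zero, so its condition number is very large and small perturbations in $b$ can produce large changes in $x$. Adding $\lambda I$ shifts every eigenvalue of $A^TA$ up by $\lambda$, so the condition number becomes $(\mu_{\max}+\lambda)/(\mu_{\min}+\lambda)$, where $\mu_{\max}$ and $\mu_{\min}$ are the largest and smallest eigenvalues of $A^TA$. This is never larger than $\mu_{\max}/\mu_{\min}$, and it is strictly smaller unless all eigenvalues are equal. That is how the $\lambda I$ term stabilizes the inverse.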
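
The effect on the condition number can be checked numerically. The following is a minimal NumPy sketch, not part of the original answer. The matrix `A`, the perturbation size `1e-6`, and the value `lam = 1e-3` are illustrative assumptions chosen only to make two columns nearly linearly dependent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: a 50x2 matrix whose columns are nearly linearly dependent.
col = rng.standard_normal(50)
A = np.column_stack([col, col + 1e-6 * rng.standard_normal(50)])

gram = A.T @ A   # the matrix A^T A that appears in the normal equations
lam = 1e-3       # illustrative regularization strength (assumption)

# Condition numbers without and with the lambda * I term.
cond_plain = np.linalg.cond(gram)
cond_reg = np.linalg.cond(gram + lam * np.eye(2))

# Compare with the eigenvalue-shift formula (mu_max + lam) / (mu_min + lam).
mu = np.linalg.eigvalsh(gram)            # eigenvalues in ascending order
cond_formula = (mu[-1] + lam) / (mu[0] + lam)

print(f"cond(A^T A)           = {cond_plain:.3e}")
print(f"cond(A^T A + lam * I) = {cond_reg:.3e}")
print(f"shift formula         = {cond_formula:.3e}")
```

Under these assumptions the regularized matrix should have a condition number many orders of magnitude smaller than that of $A^TA$, and it should agree with the shift formula up to rounding.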