Prove that the regularization term in RLS makes the matrix invertible


I am taking a machine learning course, and I learned that in LS (least squares), we have

$$\hat{\theta} = (\Phi\Phi^T)^{-1}\Phi y$$

But for RLS (regularized least squares), we have

$$\hat{\theta} = (\Phi\Phi^T + \lambda \textbf{I})^{-1}\Phi y$$

Here $\Phi \in \mathbb R^{d \times n}$, where $d$ is the number of features and $n$ is the number of samples.

The professor mentioned that when $n \lt d$ (the sample size is smaller than the number of features), the matrix $\Phi \Phi^T$ is noninvertible, since its rank is at most $n < d$, and this leads to overfitting (which I understand). So the extra term $\lambda \textbf{I}$ in RLS acts as a regularizer that makes sure the matrix

$$\Phi \Phi^T + \lambda \textbf{I}$$ is invertible. This part is where I get stuck. I don't see why adding an extra term $\lambda \textbf{I}$ can make the whole matrix invertible. Is there a proof for this?


You can prove it using positive definiteness. Note that for any $v\in \mathbb{R}^{d}$, $v^{T}(\Phi\Phi^{T})v=(\Phi^{T} v)^T(\Phi^{T} v)=\|\Phi^{T}v\|^2\geq 0$, so $\Phi\Phi^{T}$ is positive semidefinite, and $\lambda I$ is positive definite because $\lambda>0$. Thus, for any nonzero $v$,

$$v^{T}(\Phi\Phi^{T}+\lambda I)v = v^{T}(\Phi\Phi^{T})v + \lambda v^{T}v \geq \lambda \|v\|^2 > 0.$$

Therefore $\Phi\Phi^{T}+\lambda I$ is positive definite, and a positive definite matrix is invertible: if $(\Phi\Phi^{T}+\lambda I)v = 0$ for some $v \neq 0$, then $v^{T}(\Phi\Phi^{T}+\lambda I)v = 0$, contradicting the strict inequality above. So the kernel is trivial and the inverse exists.
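You can also check this numerically. A minimal sketch with NumPy (the dimensions $d=5$, $n=3$ and $\lambda=0.1$ are arbitrary choices for illustration): when $n < d$, the Gram matrix $\Phi\Phi^T$ is rank-deficient, but adding $\lambda I$ pushes every eigenvalue up by $\lambda$, making the matrix invertible.

```python
import numpy as np

# Illustrative dimensions: fewer samples (n) than features (d),
# the regime where Phi @ Phi.T must be singular.
rng = np.random.default_rng(0)
d, n = 5, 3
Phi = rng.standard_normal((d, n))  # design matrix, d x n as in the post

gram = Phi @ Phi.T                  # d x d, rank at most n < d -> singular
lam = 0.1                           # any lambda > 0 works
regularized = gram + lam * np.eye(d)

# rank(Phi Phi^T) <= n, so it cannot be invertible when n < d
print("rank of Phi Phi^T:", np.linalg.matrix_rank(gram))

# every eigenvalue of the regularized matrix is at least lambda > 0
print("smallest eigenvalue:", np.linalg.eigvalsh(regularized).min())

# the inverse exists, so the RLS estimator is well defined
y = rng.standard_normal(n)          # synthetic targets, for illustration
theta = np.linalg.solve(regularized, Phi @ y)
```

Note that `np.linalg.solve` is preferred over forming the inverse explicitly; it solves the linear system $(\Phi\Phi^T + \lambda I)\hat\theta = \Phi y$ directly.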