Casting Ridge Regression to OLS


OLS solves the problem below and has the closed-form solution $\hat{\beta} = (X^\top X)^{-1}X^\top y$: $$ \min_\beta \|y - X\beta\|^2. $$ Ridge regression solves the problem below and has the closed-form solution $\hat{\beta} = (X^\top X + \lambda I)^{-1}X^\top y$: $$ \min_\beta \|y - X\beta\|^2 + \lambda \|\beta\|^2. $$ Is it correct to say that a ridge regression problem is just an OLS problem on the augmented data $$ \tilde{y} = \begin{pmatrix} y\\ 0 \end{pmatrix} \qquad \tilde{X} = \begin{pmatrix} X \\ -\sqrt{\lambda}I \end{pmatrix}, $$ since $$ \|y-X\beta\|^2 + \lambda\|\beta\|^2 = \|\tilde{y} - \tilde{X}\beta\|^2? $$
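The identity is easy to check numerically. Here is a minimal NumPy sketch (random data, an arbitrarily chosen $\lambda$) comparing the ridge closed form against ordinary least squares on the stacked $\tilde{X}, \tilde{y}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 50, 5, 2.0
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Ridge closed form: (X^T X + lam I)^{-1} X^T y
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Augmented OLS: stack sqrt(lam) * I under X and zeros under y,
# then solve the plain least-squares problem
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])
beta_ols, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

print(np.allclose(beta_ridge, beta_ols))  # True
```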


If you have the system

$$y_1 = \beta_0 + \beta_1x_{11} + \ldots + \beta_px_{p1}$$ $$y_2 = \beta_0 + \beta_1x_{12} + \ldots + \beta_px_{p2}$$ $$\vdots $$ $$y_N = \beta_0 + \beta_1x_{1N} + \ldots + \beta_px_{pN}$$ $$0 = \sqrt{\lambda}\beta_0$$ $$0 = \sqrt{\lambda}\beta_1$$ $$\vdots$$ $$0 = \sqrt{\lambda}\beta_p$$

Then you can write the new data matrix as

$$ \begin{bmatrix} X\\ \pm \sqrt{\lambda}I\\ \end{bmatrix} $$

and the observations as you proposed. Note that the sign is arbitrary: squaring the residuals makes the negative sign positive again.