Ridge Regression Least Squares


What is the right way to take the derivative of $F$ with respect to $\beta_0$ and solve for $\beta_0$?

$ F = \frac{1}{2} \|X \beta + \mathbf{1}\beta_0 - y\|^2 + \tfrac{C}2\|\beta\|^2 $, where

$X$ is an $n_{\text{samples}} \times n_{\text{features}}$ matrix,
$y$ is an $n_{\text{samples}} \times 1$ vector,
$\mathbf{1}$ is a column vector of ones.

Here is my solution:

$ \frac{\partial F}{\partial \beta_0} = \mathbf{1}^T (X \beta + \mathbf{1} \beta_0 - y) = 0 $

$ \mathbf{1}^T X \beta + \mathbf{1}^T \mathbf{1} \beta_0 - \mathbf{1}^T y = 0 $

Since $\mathbf{1}^T \mathbf{1} = n$ (the number of samples), this becomes

$ \mathbf{1}^T X \beta + n \beta_0 - \mathbf{1}^T y = 0 $

$ \beta_0 = \frac{1}{n} \mathbf{1}^T (y - X \beta) $
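As a quick numerical sanity check of this stationarity condition (a NumPy sketch; `X`, `y`, `beta`, and the sizes are arbitrary stand-ins): since $\mathbf{1}^T \mathbf{1} = n$, the minimizing intercept is the mean of the residual $y - X\beta$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, C = 50, 3, 2.0            # arbitrary sizes and ridge strength
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
beta = rng.normal(size=p)       # beta held fixed; we minimize over beta_0 only

# Stationarity: 1^T(X beta + 1 beta_0 - y) = 0  =>  beta_0 = (1/n) 1^T (y - X beta)
beta0 = np.mean(y - X @ beta)

def F(b0):
    """Ridge objective as a function of the intercept alone."""
    r = X @ beta + b0 - y
    return 0.5 * r @ r + 0.5 * C * beta @ beta

# The derivative of F at beta0 should vanish (central difference)
eps = 1e-6
grad = (F(beta0 + eps) - F(beta0 - eps)) / (2 * eps)
```

Since $F$ is quadratic in $\beta_0$, the central difference is essentially exact, so `grad` should be numerically zero at the computed `beta0`.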

Answer:

What you can do is to adapt this into a known form.

Define $ \hat{X} = \left[ \boldsymbol{1}, X \right] $ and $ \hat{\beta} = \left[ {\beta}_{0}, {\beta}^{T} \right]^{T} $.
Also define $ D $ as the identity matrix with its first diagonal entry zeroed, $ {D}_{11} = 0 $, so that $ \beta_0 $ is not penalized.

Then your problem can be written as:

$$ z = \arg \min_{\hat{\beta}} \frac{1}{2} \left\| \hat{X} \hat{\beta} - y \right\|_{2}^{2} + \frac{C}{2} \left\| D \hat{\beta} \right\|_{2}^{2} $$

Now this is easily solved by:

$$ z = \left( \hat{X}^{T} \hat{X} + C {D}^{T} D \right)^{-1} \hat{X}^{T} y. $$

Now $ {\beta}_{0} = {z}_{1} $ and $ \beta = \left[ {z}_{2}, {z}_{3}, \ldots \right] $.
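This closed form can be sketched in NumPy (names and sizes are illustrative stand-ins; Python indexing is 0-based, so $z_1$ above corresponds to `z[0]`):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, C = 100, 4, 3.0           # illustrative sizes and ridge strength
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

# X_hat = [1, X]; D = identity with the intercept entry zeroed out
X_hat = np.hstack([np.ones((n, 1)), X])
D = np.eye(p + 1)
D[0, 0] = 0.0                   # do not penalize beta_0

# z = (X_hat^T X_hat + C D^T D)^{-1} X_hat^T y  (use solve, not an explicit inverse)
z = np.linalg.solve(X_hat.T @ X_hat + C * (D.T @ D), X_hat.T @ y)
beta0, beta = z[0], z[1:]

# Optimality check: the gradient of the objective at z should vanish
grad = X_hat.T @ (X_hat @ z - y) + C * (D.T @ D) @ z
```

Solving the linear system directly is preferred over forming the inverse, both for speed and numerical stability.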