Minimizing a function with vectors


This is part of a problem I'm working on, and I'm unclear how to do this particular step. I'm dealing with a ridge regression and I need to minimize

$$\sum_i (Y_i - \beta^Tx_i)^2 + \lambda\sum_j\beta_j^2$$

The question says I need to find $\beta$ that minimizes this and I'm not sure how to do that because I have a vector.

How do I do this?



Let us define
$$ E = \sum_i (Y_i -\beta^T x_i)^2+ \lambda \sum_j \beta_j^2, $$
and we want to find the value of $\beta$ that minimizes it. To do so, we take the gradient of $E$ with respect to $\beta$ (note that $x_{ij}$ denotes component $j$ of the vector $x_i$):
\begin{align}
\nabla_\beta E &= \begin{pmatrix} \frac{\partial}{\partial \beta_1}\\ \vdots\\ \frac{\partial}{\partial \beta_n}\end{pmatrix}E \\
&= \begin{pmatrix} \frac{\partial}{\partial \beta_1}\left(\sum_i (Y_i -\beta^T x_i)^2+ \lambda \sum_j \beta_j^2\right)\\ \vdots\\ \frac{\partial}{\partial \beta_n}\left(\sum_i (Y_i -\beta^T x_i)^2+ \lambda \sum_j \beta_j^2\right) \end{pmatrix} \\
&= \begin{pmatrix} \sum_i -2x_{i1}(Y_i -\beta^T x_i)+ 2\lambda \beta_1\\ \vdots\\ \sum_i -2x_{in}(Y_i -\beta^T x_i)+ 2\lambda \beta_n \end{pmatrix} \\
&= \begin{pmatrix} -2\sum_i x_{i1}Y_i +2\sum_i x_{i1}\, \beta^T x_i+ 2\lambda \beta_1\\ \vdots\\ -2\sum_i x_{in}Y_i +2\sum_i x_{in}\, \beta^T x_i+ 2\lambda \beta_n\end{pmatrix} \\
&= \begin{pmatrix} -2\sum_i x_{i1}Y_i +2\sum_j \beta_j \sum_i x_{i1} x_{ij}+ 2\lambda \beta_1\\ \vdots\\ -2\sum_i x_{in}Y_i +2\sum_j \beta_j \sum_i x_{in} x_{ij}+ 2\lambda \beta_n\end{pmatrix} \\
&= A\beta +r,
\end{align}
where
\begin{align}
A = 2\begin{pmatrix}
\lambda +\sum_i x_{i1}x_{i1} & \sum_i x_{i1}x_{i2} & \sum_i x_{i1}x_{i3} & \cdots & \sum_i x_{i1}x_{in} \\
\sum_i x_{i2}x_{i1} & \lambda + \sum_i x_{i2}x_{i2} & \sum_i x_{i2}x_{i3} & \cdots & \sum_i x_{i2}x_{in} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sum_i x_{in}x_{i1} & \sum_i x_{in}x_{i2} & \sum_i x_{in}x_{i3} & \cdots &\lambda+ \sum_i x_{in}x_{in}
\end{pmatrix}
\end{align}
and
\begin{align}
r = -2\begin{pmatrix}\sum_i x_{i1}Y_i \\ \vdots\\ \sum_i x_{in}Y_i \end{pmatrix}.
\end{align}
To obtain the minimum, we set the gradient to zero and solve the linear system $A\beta+r=0$. This gives you $\beta$.

I encourage you to check my indices and sums, as I did this rather quickly.
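In matrix notation the system above is $2(X^TX + \lambda I)\beta = 2X^TY$, where the rows of $X$ are the $x_i$. A quick numerical sanity check with NumPy might look like the sketch below (the toy data and the names `X`, `Y`, `lam` are my own, purely for illustration):

```python
import numpy as np

# Toy data: 50 samples, 3 features (made up for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))          # rows of X are the vectors x_i
Y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
lam = 0.1                             # ridge penalty lambda

# Solve A beta + r = 0, i.e. (X^T X + lam * I) beta = X^T Y
A = X.T @ X + lam * np.eye(X.shape[1])
beta = np.linalg.solve(A, X.T @ Y)

# Verify the gradient of E vanishes at this beta
grad = -2 * X.T @ (Y - X @ beta) + 2 * lam * beta
print(np.allclose(grad, 0))  # → True
```

Since $\beta$ solves the linear system exactly, the gradient $-2X^T(Y - X\beta) + 2\lambda\beta$ is zero up to floating-point error, confirming the derivation.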