Least-Squares Loss Function Convexity


I was trying to prove that the least-squares error function is convex, and I got stuck at the end when computing the Hessian matrix: I expected $H = XX^T$, but I found $H = X^TX$ instead.

I have no idea why.

Here's what I have tried:

Feature matrix $X=\left[\begin{array}{ll}{x_{11}} & {x_{12}} \\ {x_{21}} & {x_{22}}\end{array}\right]$, label vector $y=\left[\begin{array}{l} {y_{1}} \\ {y_{2}} \end{array}\right]$ and weight vector $w=\left[\begin{array}{l} {w_{1}} \\ {w_{2}} \end{array}\right]$

Loss function, with predictions $\hat{y}_i = w_1 x_{i1} + w_2 x_{i2}$:

$$L(w)=\sum_{i=1}^{2}\left(\hat{y}_{i}-y_{i}\right)^{2}=\left(w_{1} x_{11}+w_{2} x_{12}-y_{1}\right)^{2}+\left(w_{1} x_{21}+w_{2} x_{22}-y_{2}\right)^2$$

Second derivative: $$ H =\nabla^2 L(w_1,w_2) = \left[\begin{array}{ll} {\frac{\partial^{2} L}{\partial w_{1}^{2}}} & {\frac{\partial^{2} L}{\partial w_{1} \partial w_{2}}} \\ {\frac{\partial^{2} L}{\partial w_{2} \partial w_{1}}} & {\frac{\partial^{2} L}{\partial w_{2}^{2}}} \end{array}\right]=\left[\begin{array}{ll} {2 x_{11}^{2}+2 x_{21}^{2}} & {2 x_{11} x_{12}+2 x_{21} x_{22}} \\ {2 x_{11} x_{12}+2 x_{21} x_{22}} & {2 x_{12}^{2}+2 x_{22}^{2}} \end{array}\right]$$

Now I was expecting $H$ to equal $XX^T$, but it equals $2X^TX$: $$ H = 2\begin{pmatrix}x_{11}&x_{12}\\ x_{21}&x_{22}\end{pmatrix}^T\begin{pmatrix}x_{11}&x_{12}\\ x_{21}&x_{22}\end{pmatrix}=2\begin{pmatrix}x_{11}^2+x_{21}^2&x_{11}x_{12}+x_{21}x_{22}\\ x_{12}x_{11}+x_{22}x_{21}&x_{12}^2+x_{22}^2\end{pmatrix} $$
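The identity $H = 2X^TX$ can be checked numerically with a finite-difference Hessian of the loss. This is a sketch using NumPy, where the entries of `X`, `y`, and the evaluation point `w0` are arbitrary example values, not taken from the question:

```python
import numpy as np

# Arbitrary example data (assumed values for illustration)
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
y = np.array([1.0, -1.0])

def loss(w):
    r = X @ w - y          # residual vector Xw - y
    return r @ r           # sum of squared errors

# Finite-difference Hessian at an arbitrary point; since the loss is
# quadratic, the Hessian is the same everywhere.
w0 = np.array([0.5, -0.5])
eps = 1e-4
H = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        e_i = np.eye(2)[i] * eps
        e_j = np.eye(2)[j] * eps
        H[i, j] = (loss(w0 + e_i + e_j) - loss(w0 + e_i)
                   - loss(w0 + e_j) + loss(w0)) / eps**2

print(np.allclose(H, 2 * X.T @ X, atol=1e-3))  # True: H = 2 X^T X
```

Note that the difference quotient recovers $2X^TX$, not $2XX^T$, matching the hand computation above.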

So why is this happening, given that this loss function should be convex?

Best Answer

Note that $H$ being equal to $2X^TX$ does not contradict convexity. For any vector $a$, we have $a^TX^TXa = (Xa)^T(Xa) = \|Xa\|^2 \geq 0$, so $H$ is positive semidefinite and the loss is convex. If, moreover, $X$ has full column rank, then $Xa \neq 0$ for every nonzero $a$, so $H$ is positive definite and the loss is strictly convex. There was no reason to expect $XX^T$ here: the quadratic form in $w$ is $\|Xw - y\|^2$, whose Hessian is always built from $X^TX$.
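The positive-semidefiniteness argument can be verified by inspecting the eigenvalues of the Hessian. A minimal sketch with NumPy, again using an arbitrary example matrix `X`:

```python
import numpy as np

# Arbitrary example feature matrix (assumed values for illustration)
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])

H = 2 * X.T @ X  # Hessian of the least-squares loss

# H is symmetric, so eigvalsh gives real eigenvalues; the loss is
# convex iff all of them are >= 0.
eigvals = np.linalg.eigvalsh(H)
print(eigvals.min() >= 0)  # True: H is positive semidefinite
```

With this particular `X` (which is invertible), the smallest eigenvalue is strictly positive, so the loss is in fact strictly convex.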