Least squares method: why does setting the gradient to zero give the minimum and not the maximum?


I'm a bit confused about finding critical points of functions.

Studying the least squares method, we are given some data $\{(y_{1},x_{1}),\dots,(y_{n},x_{n})\}$ and define the error associated with the line $y=ax+b$ by

$$E(a,b)=\displaystyle\sum_{i=1}^{n}(y_{i}-[ax_{i}+b])^{2}$$
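As a concrete check, here is a small sketch of this error function in Python (the data points are made up for illustration):

```python
# Sum-of-squares error E(a, b) for the line y = a*x + b,
# matching the formula above.
def E(a, b, xs, ys):
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical example data, lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

print(E(2.0, 1.0, xs, ys))  # the true line gives zero error
print(E(1.0, 0.0, xs, ys))  # any other line gives a larger error
```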

We want to find the minimum of $E$, so we look for the point where its gradient is zero, i.e.

$$\frac{\partial E}{\partial a}=0=\frac{\partial E}{\partial b}$$
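Writing out these two conditions gives a linear system in $a$ and $b$ (the normal equations). A minimal sketch of solving it directly, with made-up data for illustration:

```python
# Setting dE/da = 0 and dE/db = 0 yields the normal equations:
#   a*sum(x_i^2) + b*sum(x_i) = sum(x_i*y_i)
#   a*sum(x_i)   + b*n        = sum(y_i)
def fit_line(xs, ys):
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = sxx * n - sx * sx  # nonzero unless all x_i are equal
    a = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b

# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
print(fit_line(xs, ys))  # recovers a = 2.0, b = 1.0
```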

But by doing that we would only find a critical point, which could be a minimum, a maximum, or neither.

Yet this calculation gives exactly the minimum. Why is that? Do we need to apply the second derivative criterion here?
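What the second derivative test would look like here, as a numerical sketch: the Hessian of $E$ is constant in $(a,b)$, so checking it at one point decides the question everywhere. The x-values below are made up for illustration.

```python
# The Hessian of E(a, b) is constant:
#   H = [[d2E/da2,  d2E/dadb],   =  [[2*sum(x^2), 2*sum(x)],
#        [d2E/dadb, d2E/db2 ]]       [2*sum(x),   2*n     ]]
xs = [0.0, 1.0, 2.0, 3.0]  # hypothetical x-values
n = len(xs)
haa = 2 * sum(x * x for x in xs)
hab = 2 * sum(xs)
hbb = 2 * n
det = haa * hbb - hab * hab  # >= 0 by Cauchy-Schwarz;
                             # > 0 unless all x_i are equal
# Positive definite iff haa > 0 and det > 0,
# in which case the critical point is a minimum.
print(haa > 0 and det > 0)  # True
```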