I have the error function
$E(\mathbf{w}) = \dfrac{1}{2} \sum\limits_{n = 1}^N \{ y(x_n, \mathbf{w}) - t_n \}^2$,
where
$y(x, \mathbf{w}) = w_0 + w_1x + w_2x^2 + \dots + w_Mx^M = \sum\limits_{j = 0}^M w_j x^j$
This is (half) the sum of squared errors between the prediction $y(x_n, \mathbf{w})$ at each data point $x_n$ and the corresponding target value $t_n$.
By substitution, we have
$E(\mathbf{w}) = \dfrac{1}{2} \sum\limits_{n = 1}^N \left( \sum\limits_{j = 0}^M w_jx^j_n - t_n \right)^2$
To find the minimum of the error function, we set each partial derivative equal to $0$:
$\dfrac{\partial{}}{\partial{}w_i} E(\mathbf{w}) = \sum\limits_{n = 1}^N x_n^i \left( \sum\limits_{j = 0}^M w_jx_n^j - t_n \right) = 0$
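Rearranging, this is a system of $M + 1$ linear equations in the coefficients $w_j$ (the normal equations):

$\sum\limits_{j = 0}^M A_{ij} w_j = T_i$, where $A_{ij} = \sum\limits_{n = 1}^N x_n^{i + j}$ and $T_i = \sum\limits_{n = 1}^N x_n^{i} t_n$.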
The error function is always non-negative. However, that alone does not rule out multiple critical points, right? In that case, how do we know that we are solving for the global minimum and not merely a local minimum? My textbook says to minimise the error function, but that only makes sense if there is a global minimum, right?
So when we set the derivatives equal to $0$, as was done above, which critical point are we actually solving for?
I've become very confused thinking about this, so I'd appreciate any help and explanations to clear this up.
This is an ordinary linear least squares problem; see https://en.wikipedia.org/wiki/Linear_least_squares_(mathematics)#Computation.
Because the optimization problem is convex (convex quadratic objective function with no constraints), every stationary point is a global minimum.
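To see the convexity directly, compute the second derivatives of your error function. Writing $\Phi$ for the design matrix with entries $\Phi_{nj} = x_n^j$,

$\dfrac{\partial^2 E}{\partial w_i \partial w_k} = \sum\limits_{n = 1}^N x_n^{i + k} = (\Phi^\mathsf{T} \Phi)_{ik}$,

and for any vector $\mathbf{v}$ we have $\mathbf{v}^\mathsf{T} \Phi^\mathsf{T} \Phi \mathbf{v} = \lVert \Phi \mathbf{v} \rVert^2 \ge 0$, so the Hessian is positive semidefinite and $E$ is convex everywhere.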
If the problem is non-degenerate (the design matrix has full rank), there is a unique global minimizer; otherwise there are infinitely many solutions, all attaining the same global minimum objective value. Widely available specialized software solves such problems robustly. See https://stats.stackexchange.com/questions/160179/do-we-need-gradient-descent-to-find-the-coefficients-of-a-linear-regression-mode/164164#164164
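For concreteness, here is a minimal NumPy sketch of the polynomial fit above (the toy data are made up for illustration): build the Vandermonde design matrix $\Phi_{nj} = x_n^j$ and solve the least squares problem, once via an SVD-based solver and once via the normal equations.

```python
import numpy as np

# Toy data: noisy samples of a sinusoid (hypothetical example data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
t = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

M = 3  # polynomial degree
# Design matrix Phi with Phi[n, j] = x_n ** j for j = 0, ..., M.
Phi = np.vander(x, M + 1, increasing=True)

# Solve min_w ||Phi w - t||^2. lstsq uses an SVD, so it is robust
# even when Phi is rank-deficient (it then returns the minimum-norm
# solution among all global minimizers).
w, residuals, rank, sv = np.linalg.lstsq(Phi, t, rcond=None)

# For full-rank Phi, solving the normal equations Phi^T Phi w = Phi^T t
# gives the same unique global minimizer.
w_normal = np.linalg.solve(Phi.T @ Phi, Phi.T @ t)
assert np.allclose(w, w_normal)
```

Both routes land on the same coefficients here because the 20 distinct sample points make the Vandermonde matrix full rank; in the degenerate case only the SVD route is safe.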