Gradient descent is often introduced with the mean squared error, whose graph in one dimension is a parabola, y = x^2.
Yet we often say that weight adjustment in a neural network by gradient descent can hit a local minimum and get stuck there.
My question is: how is a local minimum possible on a parabola, which has a single global minimum and no other stationary points?
The behavior is parabolic only close to a minimum. The MSE is quadratic in the model's prediction, but as a function of the weights, which enter the prediction through a non-linear model, the error surface is not a parabola, and there can be as many local minima as you want!
Think of a total least-squares line-fitting problem where the data are just four points forming a square. By symmetry, there must be several equally good solutions (the diagonals or the medians), so the error surface has several minima.
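To make this concrete, here is a minimal sketch (my own toy example, not from the answer above) using the deliberately simple non-linear model y_hat = sin(w * x). The MSE is still quadratic in the prediction, yet as a function of the single weight w it is oscillatory and has several local minima:

```python
import numpy as np

# Toy data generated by the model itself at w = 2, so the loss
# has its global minimum (value 0) at w = 2.
x = np.array([1.0, 2.0, 3.0])
y = np.sin(2.0 * x)

def mse(w):
    """MSE of the non-linear model y_hat = sin(w * x)."""
    return np.mean((np.sin(w * x) - y) ** 2)

# Evaluate the loss on a fine grid of weight values.
ws = np.linspace(0.0, 6.0, 601)
losses = np.array([mse(w) for w in ws])

# An interior grid point lower than both neighbours is a
# (numerical) local minimum of the loss curve.
is_local_min = (losses[1:-1] < losses[:-2]) & (losses[1:-1] < losses[2:])
print("local minima found:", int(is_local_min.sum()))
print("best w:", ws[np.argmin(losses)])
```

Running this finds the global minimum near w = 2 plus at least one other local minimum, even though the loss at each data point is "just a parabola" in the prediction. A gradient-descent run started in the wrong basin would converge to the inferior minimum.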