I am learning ML and gradient descent and was wondering about something. To find the next step we differentiate with respect to each theta in turn. But aren't there cases where the partial derivative w.r.t. theta_0 tells us that a step to the right brings us to a lower point, and the partial derivative w.r.t. theta_1 tells us that a step forward brings us to a lower point, yet when we update both simultaneously (one step right and one step forward) we find we are actually at a higher point than before? Because while (1, 0) and (0, 1) are lower than (0, 0), (1, 1) is actually higher.
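Here is a quick numerical sketch of what I mean (the quadratic f and the learning rates are just values I made up for illustration):

```python
# Toy cost where each axis-aligned unit step from (0, 0) lowers f, but taking
# both steps at once overshoots:
# f(1, 0) < f(0, 0) and f(0, 1) < f(0, 0), yet f(1, 1) > f(0, 0).
def f(t0, t1):
    return (t0 + t1 - 0.8) ** 2

def grad(t0, t1):
    g = 2 * (t0 + t1 - 0.8)  # the partial happens to be the same for both thetas
    return g, g

print(f(0, 0), f(1, 0), f(0, 1), f(1, 1))  # approx. 0.64, 0.04, 0.04, 1.44

# Simultaneous gradient-descent update from (0, 0) with two learning rates:
for lr in (0.625, 0.1):
    g0, g1 = grad(0.0, 0.0)
    t0, t1 = -lr * g0, -lr * g1
    print(f"lr={lr}: new point ({t0:.2f}, {t1:.2f}), f = {f(t0, t1):.4f}")
# lr=0.625 jumps exactly to (1, 1), where f = 1.44 is *higher* than f(0, 0) = 0.64;
# lr=0.1 only moves to (0.16, 0.16), where f = 0.2304 is lower.
```

So the simultaneous update really can land uphill, but seemingly only when the step is too big for the curvature?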
Or imagine you are on a surface like this:

and you are at the point (0, 0). Now it seems like both partial derivatives are zero, even though there are clearly lower points nearby. Does the algorithm fail us in this case?
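If it helps, here is a concrete stand-in for that picture (I'm assuming it shows something saddle-like, e.g. f(x, y) = x*y, where both axis-aligned partials vanish at the origin):

```python
# Saddle-like surface: both partial derivatives are zero at (0, 0), so the
# update goes nowhere, yet strictly lower points exist right next door.
def f(x, y):
    return x * y

def grad(x, y):
    return y, x  # df/dx = y and df/dy = x, so both vanish at the origin

print(grad(0.0, 0.0))  # (0.0, 0.0) -> the gradient step does not move at all
print(f(0.0, 0.0))     # 0.0
print(f(0.1, -0.1))    # -0.01, a lower point along the diagonal
```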
Is my understanding of partial derivatives incorrect, or are these issues simply compensated for by the fact that in gradient descent we usually repeat the algorithm from several different starting points, and/or that real cost surfaces are rarely that complex?
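By "repeat from different starting points" I mean a restart loop like this hypothetical sketch (the double-well f and all the constants are made up):

```python
import random

# Hypothetical restart loop: run plain gradient descent from several random
# starting points and keep the best finishing point. This double-well f has
# two local minima; only starts that fall into the left basin reach the
# global one near x = -1.04.
def f(x):
    return (x**2 - 1) ** 2 + 0.3 * x

def grad(x):
    return 4 * x * (x**2 - 1) + 0.3

def descend(x, lr=0.01, steps=500):
    for _ in range(steps):
        x -= lr * grad(x)
    return x

random.seed(0)
finishes = [descend(random.uniform(-2, 2)) for _ in range(10)]
best = min(finishes, key=f)
print(best, f(best))  # should be near x = -1.04 with f around -0.31
```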
Sorry if this is too noob a question, and thanks.