Gradient descent converges to critical points for non-convex functions


I was wondering: is there a way to show that, for a non-convex function, gradient descent can only converge to a critical point or escape to infinity, assuming the function is C2 (and hence has a locally Lipschitz continuous gradient)? The usual proofs use convexity to show convergence for convex functions (for example

http://www.stat.cmu.edu/~ryantibs/convexopt-F13/scribes/lec6.pdf

), and in most cases I have seen, the result is simply taken for granted in the non-convex case.



Honestly, it's not guaranteed to converge. The standard way to guarantee convergence, in the sense that every limit point of the iterates is a critical point, is to use a globalization method such as a trust-region or line-search method. However, these globalization techniques are not unique to gradient descent; Newton methods also use them in order to guarantee convergence from an arbitrary starting point.
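
To make the line-search idea concrete, here is a minimal sketch of gradient descent with a backtracking (Armijo) line search on a simple non-convex function. The function names, constants, and test objective are my own choices for illustration; the sufficient-decrease condition is the ingredient that the convergence proofs rely on.

```python
import numpy as np

def grad_descent_backtracking(f, grad, x0, alpha0=1.0, c=1e-4, rho=0.5,
                              tol=1e-8, max_iter=10000):
    """Gradient descent with a backtracking (Armijo) line search.

    The Armijo condition forces f to decrease sufficiently at every step,
    which is what the standard proofs use to show that every limit point
    of the iterates is a critical point.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:   # approximate critical point reached
            break
        alpha = alpha0
        # Shrink alpha until f(x - alpha*g) <= f(x) - c*alpha*||g||^2.
        while f(x - alpha * g) > f(x) - c * alpha * np.dot(g, g):
            alpha *= rho
        x = x - alpha * g
    return x

# A simple non-convex objective: f(x, y) = x^4 - 2 x^2 + y^2,
# with critical points at (0, 0) (saddle) and (+-1, 0) (minima).
f = lambda x: x[0]**4 - 2 * x[0]**2 + x[1]**2
grad = lambda x: np.array([4 * x[0]**3 - 4 * x[0], 2 * x[1]])

x_star = grad_descent_backtracking(f, grad, x0=[0.5, 1.0])
print(x_star, np.linalg.norm(grad(x_star)))  # lands near the minimizer (1, 0)
```

From this starting point the iterates settle into the basin of the minimizer at (1, 0), and the gradient norm is driven below the tolerance, which is exactly the "converges to a critical point" behavior asked about.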

The standard reference for these globalization techniques is Nocedal and Wright's Numerical Optimization. It contains proofs for the convergence of both methods.
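
For the fixed-step case, the core estimate behind those proofs is short. Assuming the gradient is L-Lipschitz and f is bounded below (both assumptions, not givens from the question), the descent lemma with step size 1/L yields:

```latex
% Descent lemma with step size \alpha = 1/L:
f(x_{k+1}) = f\!\left(x_k - \tfrac{1}{L}\nabla f(x_k)\right)
           \le f(x_k) - \tfrac{1}{2L}\,\|\nabla f(x_k)\|^2 .
% Summing over k = 0, \dots, N and telescoping:
\tfrac{1}{2L}\sum_{k=0}^{N}\|\nabla f(x_k)\|^2 \le f(x_0) - f(x_{N+1}) \le f(x_0) - \inf f .
% Boundedness below makes the sum converge, so \|\nabla f(x_k)\| \to 0.
```

Since the gradient norms tend to zero, every limit point of the iterates is a critical point; if the iterates have no limit point, they escape to infinity, which is the dichotomy in the question. Note this uses no convexity at all.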