Stochastic gradient descent for convex optimization

1.3k Views Asked by Bumbble Comm At 30 Mar 2026 - 8:34

What happens if a convex objective is optimized by stochastic gradient descent? Is a global solution achieved?

There are 1 best solutions below

Bumbble Comm On 23 Jan 2014 - 12:01 BEST ANSWER

Eventually :). A convex objective function has no local minima, so, for every point in the domain, the integral curve of the gradient vector field from that point leads to the global minimum and SGD approximately follows the integral curve as long as the learning rate is small enough. It is not even necessary to track the integral curve very closely, since a gradient integral curve leads the global minimum from wherever you land, you just need to keep moving downhill. The real issue is the rate of progress, which can easily stall in regions with small gradient. Avoiding this also depends on the learning rate, making the learning rate critical to the practical use of SGD. I believe there are theoretical guarantees of convergence if you vary the learning rate inversely with time. The best place to look for specific results is in Leon Bottou's papers and the Pegasos paper. The current focus of research seems to be on varying the learning rate adaptively during training based on the learning progress thus far. An early version of this was the passive-aggressive perceptron. More recent methods like "natural gradient" and the AROW algorithm adaptively maintain a separate learning rate for each component of the gradient.

Stochastic gradient descent for convex optimization

There are 1 best solutions below

Related Questions in NUMERICAL-METHODS

Related Questions in CONVEX-ANALYSIS

Related Questions in CONVEX-OPTIMIZATION

Related Questions in MACHINE-LEARNING

Related Questions in GRADIENT-DESCENT

Trending Questions

Popular # Hahtags

Popular Questions