I was reading these notes (if the previous link doesn't work, use this) on the subgradient method. They say that the step sizes (or step lengths) are determined before the algorithm is run, and I can't understand why this is done. What would happen if we used an adaptive scheme like the ones used in gradient descent, perhaps something like AdaGrad?
I'm trying to implement an SVM using a mini-batch stochastic gradient descent kind of method. While it does classify data points correctly, it doesn't quite maximize the slab thickness (the margin).
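To make the setup concrete, here is a minimal sketch of the kind of thing I mean by "mini-batch stochastic gradient descent on an SVM" (the function name, parameters, and constant step size are my own choices, not from the notes; the loss is the L2-regularized hinge loss):

```python
import numpy as np

def svm_minibatch_sgd(X, y, lam=0.01, alpha=0.1, epochs=50, batch=16, seed=0):
    """Mini-batch subgradient descent on (lam/2)*||w||^2 + mean hinge loss.
    Labels y must be in {-1, +1}. Uses a *constant* step size alpha,
    fixed before the run -- the question is whether this is the only option."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch):
            sel = idx[start:start + batch]
            Xb, yb = X[sel], y[sel]
            margins = yb * (Xb @ w + b)
            viol = margins < 1  # points violating the margin contribute a subgradient
            # subgradient of the regularized hinge loss on this mini-batch
            gw = lam * w - (yb[viol, None] * Xb[viol]).sum(axis=0) / len(sel)
            gb = -yb[viol].sum() / len(sel)
            w -= alpha * gw
            b -= alpha * gb
    return w, b
```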
PS: If the link isn't accessible, the relevant information is reproduced below.
Step size rules
- Constant step size, $\alpha_k = \alpha$ is a positive constant, independent of $k$.
- Constant step length, $\alpha_k = \gamma/\|g^{(k)}\|_2$. This means that $\|x^{(k+1)} - x^{(k)}\|_2 = \gamma$.
Three more rules are listed: square summable but not summable, nonsummable diminishing, and nonsummable diminishing step lengths.
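To see what "determined before the algorithm is run" means in practice, here is a hypothetical sketch of these schedules as pure functions of the iteration counter $k$ (the specific constants and the example formulas for the last three rules are my own illustrative choices; the step-length rules also divide by the current subgradient norm, but the sequence of lengths itself is fixed in advance):

```python
import numpy as np

def step_size(k, rule, g_norm=1.0, alpha=0.1, gamma=0.1):
    """Classic subgradient step-size schedules; k is 1-indexed.
    Only k (and, for step-length rules, the subgradient norm) enters --
    no function values, no line search, no accumulated gradient statistics."""
    if rule == "constant_size":            # alpha_k = alpha
        return alpha
    if rule == "constant_length":          # ||x_{k+1} - x_k||_2 = gamma
        return gamma / g_norm
    if rule == "square_summable":          # e.g. alpha_k = alpha/k
        return alpha / k
    if rule == "nonsummable_diminishing":  # e.g. alpha_k = alpha/sqrt(k)
        return alpha / np.sqrt(k)
    if rule == "diminishing_length":       # gamma_k = gamma/sqrt(k), divided by ||g||
        return gamma / (np.sqrt(k) * g_norm)
    raise ValueError(f"unknown rule: {rule}")
```

Contrast this with AdaGrad, where the effective step size depends on the sum of squared gradients observed so far, i.e. on data computed during the run.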
The notes further say
"The most interesting feature of these choices is that they are determined before the algorithm is run; they do not depend on any data computed during the algorithm. This is very different from the step size rules found in standard descent methods, which very much depend on the current point and search direction."