Consider the problem of minimizing $f(x)+g(x)$ over $\mathbb{R}^n$, where $f$ and $g$ are both smooth, and consider the following first-order method: $$ x_{k+1}= \operatorname{argmin}_x\left \{ \nabla f(x_k)^T(x-x_k) + \frac{1}{\alpha^{p-1} p} \|x-(x_k - \alpha \nabla g(x_k))\|_p^p \right \} $$
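Since the subproblem is coordinate-separable, it has a closed form that may help identify the method. Writing $z_k = x_k - \alpha \nabla g(x_k)$ and setting the $i$-th partial derivative to zero gives, for $p>1$ (unless I have slipped in the algebra),

$$ \nabla_i f(x_k) + \frac{1}{\alpha^{p-1}} |x_i - (z_k)_i|^{p-2} (x_i - (z_k)_i) = 0, $$

so that, elementwise,

$$ x_{k+1} = x_k - \alpha \nabla g(x_k) - \alpha \operatorname{sign}(\nabla f(x_k)) \odot |\nabla f(x_k)|^{1/(p-1)}. $$

That is, a plain gradient step on $g$ combined with a signed, power-rescaled gradient step on $f$.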
Is it a special case of some known method?
What about the stochastic setting, when $f(x) = \frac{1}{m} \sum_{i=1}^m f_i(x)$, replacing $\nabla f$ by $\nabla f_i$, where $i$ is chosen at random?
Note that it is easy to see that for $p=2$, we obtain exactly the (stochastic) gradient method with step-size $\alpha$.
The method above was coded by mistake; the more natural method we actually aimed to test was $$ x_{k+1}= \operatorname{argmin}_x \left\{ [\nabla f(x_k) + \nabla g(x_k)]^T(x-x_k) + \frac{1}{\alpha^{p-1} p} \|x-x_k\|_p^p \right \} $$ with $p=3$. Surprisingly, the stochastic version of the 'mistake' performed much better than the stochastic version of this more natural method.
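For concreteness, here is a minimal sketch of both deterministic updates on a toy quadratic, using the coordinate-wise closed forms of the two argmins (my own derivation, so treat it as an assumption; the problem $f(x)=\tfrac12\|x\|^2$, $g(x)=\tfrac12\|x-\mathbf{1}\|^2$ and the step-size are arbitrary choices for illustration):

```python
import numpy as np

def accidental_step(x, grad_f, grad_g, alpha, p):
    """One step of the 'mistake' method: gradient step on g plus a
    signed, power-rescaled step on f (closed form of the argmin, assuming p > 1)."""
    gf = grad_f(x)
    return x - alpha * grad_g(x) - alpha * np.sign(gf) * np.abs(gf) ** (1.0 / (p - 1))

def natural_step(x, grad_f, grad_g, alpha, p):
    """One step of the intended method: signed, power-rescaled step
    on the full gradient of f + g."""
    s = grad_f(x) + grad_g(x)
    return x - alpha * np.sign(s) * np.abs(s) ** (1.0 / (p - 1))

# Toy problem: f(x) = 0.5*||x||^2, g(x) = 0.5*||x - 1||^2.
grad_f = lambda x: x
grad_g = lambda x: x - 1.0
F = lambda x: 0.5 * np.sum(x ** 2) + 0.5 * np.sum((x - 1.0) ** 2)

x_a = x_n = np.array([5.0, -3.0])
for _ in range(200):
    x_a = accidental_step(x_a, grad_f, grad_g, alpha=0.1, p=3)
    x_n = natural_step(x_n, grad_f, grad_g, alpha=0.1, p=3)

print(F(x_a), F(x_n))  # both far below the initial value F = 33
```

Note that on this problem the two methods converge to different points: the 'mistake' balances $\nabla g$ against $\operatorname{sign}(\nabla f)|\nabla f|^{1/(p-1)}$, so its fixed point is not a stationary point of $f+g$ in general.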