How do mirror prox and accelerated methods differ conceptually?

149 Views Asked by Bumbble Comm At 24 Apr 2026 - 4:19

In the very basic case of nice function on Hilbert space, dual space is same as primal, it seems to me that they to almost same thing: at point $x_t$ they apply a gradient obtained by looking "one step ahead" $x_{t+1} = x_t + \eta \nabla f(x_t + \eta \nabla f(x_t))$. Mirror prox adds an additional restriction on closeness of both "look ahead" and "gradient step" points in terms of some divergence:

$$ \hat z_t = \arg\min_z (D(z, z_t) + \langle \eta \nabla f(z_t), z \rangle)$$ $$ z_{t+1} = \arg\min_z (D(z, z_t) + \langle \eta \nabla f(\hat z_t), z \rangle)$$

Does the divergence-closeness part is the only difference? Should they perform similarly in saddle point problems without constraints?

Thank you.

Original Q&A

How do mirror prox and accelerated methods differ conceptually?

Related Questions in DIVERGENCE-OPERATOR

Related Questions in GRADIENT-DESCENT

Related Questions in CONVERGENCE-ACCELERATION

Trending Questions

Popular # Hahtags

Popular Questions