In the very basic case of nice function on Hilbert space, dual space is same as primal, it seems to me that they to almost same thing: at point $x_t$ they apply a gradient obtained by looking "one step ahead" $x_{t+1} = x_t + \eta \nabla f(x_t + \eta \nabla f(x_t))$. Mirror prox adds an additional restriction on closeness of both "look ahead" and "gradient step" points in terms of some divergence:
$$ \hat z_t = \arg\min_z (D(z, z_t) + \langle \eta \nabla f(z_t), z \rangle)$$ $$ z_{t+1} = \arg\min_z (D(z, z_t) + \langle \eta \nabla f(\hat z_t), z \rangle)$$
Does the divergence-closeness part is the only difference? Should they perform similarly in saddle point problems without constraints?
Thank you.