As is well known, the forward-backward algorithm is generally not guaranteed to converge for saddle point problems, whereas the extragradient method is guaranteed to converge (see, e.g., "Structured Prediction via the Extragradient Method").
Suppose $L(w,z)$ is convex in $w$ and concave in $z$. Can we express the extragradient method, \begin{align} w^{k+1/2}&=P_{W}(w^k-\beta \nabla_w L(w^k,z^k))\cr z^{k+1/2}&=P_{Z}(z^k+\beta \nabla_z L(w^k,z^k))\cr w^{k+1}&=P_{W}(w^k-\beta \nabla_w L(w^{k+1/2},z^{k+1/2}))\cr z^{k+1}&=P_{Z}(z^k+\beta \nabla_z L(w^{k+1/2},z^{k+1/2}))\cr \end{align} using only proximal steps, i.e. without evaluating the gradient of $L$? Sorry for asking the same question on mathoverflow.
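For reference, the updates above can be sketched in a few lines of NumPy; the matrix $A$, starting points, and step size below are illustrative choices of mine, not part of the question:

```python
import numpy as np

def project_ball(x, radius=1.0):
    """Euclidean projection onto the ball {x : ||x|| <= radius}."""
    n = np.linalg.norm(x)
    return x if n <= radius else (radius / n) * x

def extragradient(grad_w, grad_z, w0, z0, beta, iters,
                  P_W=project_ball, P_Z=project_ball):
    """Extragradient iteration for min_w max_z L(w, z) over W x Z."""
    w, z = np.asarray(w0, float), np.asarray(z0, float)
    for _ in range(iters):
        # predictor (half) step: an ordinary projected gradient step
        w_half = P_W(w - beta * grad_w(w, z))
        z_half = P_Z(z + beta * grad_z(w, z))
        # corrector step: gradients re-evaluated at the midpoint
        w = P_W(w - beta * grad_w(w_half, z_half))
        z = P_Z(z + beta * grad_z(w_half, z_half))
    return w, z

# Toy bilinear saddle problem L(w, z) = w^T A z over unit balls;
# A is invertible, so the unique saddle point is (0, 0).
A = np.array([[0.0, 1.0], [-1.0, 0.5]])
grad_w = lambda w, z: A @ z      # nabla_w L
grad_z = lambda w, z: A.T @ w    # nabla_z L
w_star, z_star = extragradient(grad_w, grad_z,
                               [0.9, -0.4], [0.3, 0.8],
                               beta=0.2, iters=3000)
```

Note that plain projected gradient descent-ascent (dropping the half step) cycles on this bilinear example, while the extra midpoint evaluation makes the iterates contract to the saddle point.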
There is a refinement of the HPE method in the paper here. Consider the problem model in equation (7). In the jargon of that paper, the problem you're trying to solve corresponds to a Nash-equilibrium problem with $\Psi_1(z, w) = -\Psi_2(z, w) := \sum_{i}w^TF_iz_i$ (a bilinear form), $g_1(z) := \sum_{i}c_i^Tz_i$, and $g_2(w) := \sum_{i}w^TF_iy^{(i)} + I_{\|\cdot\| \le 1}(w)$. Now invoke algorithm T-BD (or Tseng's BD-HPE; see the same paper). It should be trivial to compute the proximal operators of the $g_k$'s.
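To spell out that last remark: the prox of a linear function is a constant shift, and the prox of linear-plus-indicator $I_{\|\cdot\|\le 1}$ is that shift followed by projection onto the unit ball. A minimal sketch (the helper names are mine, not from the paper, and $c$, $a$ stand for the aggregated linear coefficients of $g_1$, $g_2$):

```python
import numpy as np

def prox_linear(x, c, lam):
    """prox_{lam * g1}(x) for g1(z) = c^T z, i.e.
    argmin_z  c^T z + ||z - x||^2 / (2 * lam)  =  x - lam * c."""
    return x - lam * c

def prox_linear_plus_ball(x, a, lam, radius=1.0):
    """prox_{lam * g2}(x) for g2(w) = a^T w + I_{||w|| <= radius}(w):
    shift by the linear part, then project onto the ball."""
    y = x - lam * a
    n = np.linalg.norm(y)
    return y if n <= radius else (radius / n) * y
```

Both operators are exact and cost one vector operation each, which is what makes the block-decomposition scheme cheap per iteration here.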