Conditions of convergence of stochastic subgradient algorithm


It is well known that, for appropriate step sizes, $E[g^t] \in \partial f(x^t)$ is a sufficient condition for the stochastic subgradient algorithm to converge. What I'm wondering is whether this requirement has to hold conditioned on the previous subgradients.
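For reference, the condition usually stated in convergence proofs is the conditional one, taken with respect to the filtration $\mathcal{F}_{t-1}$ generated by the past iterates and directions:

$$E\left[g^t \mid \mathcal{F}_{t-1}\right] \in \partial f(x^t),$$

which is strictly stronger than the unconditional requirement $E[g^t] \in \partial f(x^t)$ in the question.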

Say we alternate among $k$ directions, sampled randomly without replacement, some of which are not subgradients but whose average is. Then $E[g^t]$ is always a subgradient, but $E[g^t \mid g^{t-1}]$ need not be.
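A minimal sketch of the scheme described above, with hypothetical choices throughout: we minimize $f(x) = \|x\|_1$, and each direction $g^t$ is a true subgradient plus one of $k$ fixed perturbations that sum to zero, drawn without replacement each epoch. The unconditional mean of $g^t$ is then a subgradient, but conditionally on the earlier draws of the epoch it is not.

```python
import numpy as np

# Illustration (hypothetical setup): minimize f(x) = ||x||_1 using
# directions g_t that are subgradients only on average.  Each epoch we
# shuffle k fixed perturbations d_1, ..., d_k with sum 0 (sampling
# WITHOUT replacement), so E[g_t] is a true subgradient, but
# E[g_t | g_1, ..., g_{t-1}] is not: once some d_i have been used, the
# remaining ones no longer average to zero.

rng = np.random.default_rng(0)
dim, k = 5, 4

# Fixed perturbations that cancel exactly over a full epoch.
perturbations = rng.normal(size=(k, dim))
perturbations -= perturbations.mean(axis=0)  # now they sum to zero

x = np.ones(dim) * 3.0   # starting point, f(x) = 15
t = 0
for epoch in range(200):
    for i in rng.permutation(k):          # without-replacement order
        t += 1
        subgrad = np.sign(x)              # a subgradient of ||x||_1
        g = subgrad + perturbations[i]    # conditionally biased direction
        x -= (1.0 / t) * g                # diminishing step size

print(np.abs(x).sum())  # f(x) after 200 epochs
```

Empirically the iterates still approach the minimizer here, consistent with the observation that such schemes often work in practice even though the conditional-subgradient assumption fails.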


BEST ANSWER

It turns out I ran into an open problem. The theoretical properties of this kind of scheme are poorly understood, but it seems to perform well in practice; see arxiv.org/pdf/1202.4184.pdf and http://www.optimization-online.org/DB_FILE/2014/12/4679.pdf for instance.

Violating independence breaks several of the assumptions used in the standard convergence proofs, although this doesn't seem to matter much in practice.