Convergence of stochastic approximation with Markov chain input and nontrivial dependence on the input?


Assume $u_t$ is some time-homogeneous Markov chain with unique stationary distribution $\pi$.

Consider iterations of the form

$$x_{t} = f(y_t,u_t)$$ $$y_{t+1} = y_t + \varepsilon_t \nabla_y g(v,w,y)\vert_{v = x_{t+1},w=x_t,y=y_t},$$

where $g$ is some function.
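
To make the scheme concrete, here is a minimal runnable sketch of the iteration. Every ingredient is a hypothetical choice, not part of the question: a two-state chain for $u_t$, $f(y,u) = u + 0.1\tanh(y)$, $g(v,w,y) = -y^2 + y(v+w)$, step sizes $\varepsilon_t = 1/t$, and the reading $x_{t+1} = f(y_t, u_{t+1})$ (i.e. $x_{t+1}$ is evaluated at the not-yet-updated $y_t$).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ingredients (not from the question):
# a two-state Markov chain u_t with transition matrix P,
# f(y, u) = u + 0.1*tanh(y), and g(v, w, y) = -y**2 + y*(v + w),
# whose partial y-gradient (the dependence of v, w on y ignored,
# as in the question) is dg/dy = -2*y + v + w.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])        # transition matrix of u_t
states = np.array([0.5, 1.5])     # values the chain takes

def f(y, u):
    return u + 0.1 * np.tanh(y)

def grad_g_y(v, w, y):
    return -2.0 * y + v + w

def run(T=20000, y0=0.0):
    y = y0
    s = 0                               # current state index of the chain
    x_prev = f(y, states[s])            # x_t
    for t in range(1, T + 1):
        s = rng.choice(2, p=P[s])       # advance the chain: u_{t+1}
        x_next = f(y, states[s])        # x_{t+1}, evaluated at the current y_t
        y += (1.0 / t) * grad_g_y(x_next, x_prev, y)  # Robbins-Monro steps
        x_prev = x_next
    return y
```

With these choices the mean field $-2y + 2(\mathbb{E}_\pi[u] + 0.1\tanh y)$ is strongly contracting, so the iterate settles near its unique root; whether such behavior carries over to a given $f$, $g$ is exactly what the question asks.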

Is there any theory for checking the convergence of this kind of update rule? It resembles stochastic approximation / the Robbins–Monro algorithm, but there are key differences:

  • The dependence of $x_t$ on both $y_t$ and $u_t$ means that $x_t$ itself is not time-homogeneous.

  • The updates depend on both $x_{t+1}$ and $x_t$ (by itself this might not be a problem; at least the pair process $(u_{t+1},u_t)$ retains properties such as time-homogeneity).

  • Only the partial gradient of $g$ with respect to $y$ is taken, ignoring the dependence of $x_t$ on $y_t$.

My question is now whether, despite these differences, there will under some conditions still be convergence to

$$y^* = \arg \max_y \mathbb{E}_{u_{t+1},u_t}[g(f(y,u_{t+1}),f(y,u_t),y)],$$

where the expectation is taken with $(u_{t+1},u_t)$ at stationarity, and whether there is a reference that might help in understanding this setting.
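
One way to probe the third difference numerically: for a concrete (hypothetical) choice of chain, $f$, and $g$, compare the root of the *partial*-gradient mean field (which is what the iteration follows) with the stationary point of the *full* objective, where the chain rule through $f$ is included. All choices below are illustrative assumptions, not from the question: $u_t \in \{0.5, 1.5\}$ with stationary law $\pi = (2/3, 1/3)$, $f(y,u) = u + 0.1\tanh(y)$, $g(v,w,y) = -y^2 + y(v+w)$.

```python
import numpy as np

# Hypothetical setup: E_pi[u] for a chain on {0.5, 1.5} with pi = (2/3, 1/3).
Eu = (2/3) * 0.5 + (1/3) * 1.5   # = 5/6

def partial_field(y):
    # E[dg/dy] with the dependence of x on y ignored
    # (this is the drift the update rule actually follows).
    return -2*y + 2*(Eu + 0.1*np.tanh(y))

def total_derivative(y):
    # d/dy of the full objective E[g(f(y,u'), f(y,u), y)],
    # chain rule through f included: extra term 2*y*f'(y).
    return partial_field(y) + 2*y * 0.1/np.cosh(y)**2

def bisect(h, lo, hi, n=80):
    # Simple bisection; assumes a sign change of h on [lo, hi].
    for _ in range(n):
        mid = 0.5*(lo + hi)
        if h(lo)*h(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5*(lo + hi)

y_partial = bisect(partial_field, 0.0, 2.0)   # where the iteration settles
y_argmax  = bisect(total_derivative, 0.0, 2.0)  # the actual argmax
```

For these choices the two points differ (by a few percent), which illustrates that dropping the $y$-dependence of $x_t$ can shift the limit away from the argmax above; any general answer would have to quantify when this gap vanishes or stays acceptable.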