What is meant by this scaling limit of an ODE in Borkar's stochastic approximation book?


I'm reading Vivek Borkar's book "Stochastic Approximation: A Dynamical Systems Viewpoint" (2008 edition). At the beginning of chapter 3, he says:

[Book excerpt introducing the scaled functions $h_c(x) = h(cx)/c$ and their limit]

here, the update rule is: $$x_{n+1} = x_n + a(n) (h(x_n) + M_{n+1})$$

where the $x_n$ are the iterates, the $a(n)$ are decreasing step sizes (e.g. $a(n) = 1/n$), and $M_{n+1}$ is martingale-difference noise added at each step.
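To make the recurrence concrete, here is a minimal sketch of it in Python, assuming i.i.d. Gaussian noise for $M_{n+1}$ (which is a martingale difference) and using my linear example $h(x) = -(x - 5)$ with $d = 1$; the function name and defaults are my own choices, not from the book:

```python
import random

def sa_iterates(h, x0=0.0, n_steps=20000, noise_std=1.0, seed=0):
    """Run the stochastic approximation recurrence
    x_{n+1} = x_n + a(n) * (h(x_n) + M_{n+1}),
    with a(n) = 1/n and i.i.d. Gaussian noise as the martingale difference."""
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_steps + 1):
        a = 1.0 / n
        m = rng.gauss(0.0, noise_std)
        x = x + a * (h(x) + m)
    return x

# With h(x) = -(x - 5), the ODE xdot = h(x) converges to 5 from any start,
# and the iterates should track it for this well-behaved (d = 1) case.
x_final = sa_iterates(lambda x: -(x - 5.0))
```

For this mild choice of $h$ the iterates do settle near the equilibrium $x = 5$, which is the behavior the ODE method is supposed to capture.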

I understand the high-level point: he wants to show that the $x_n$ asymptotically track the solution of the ODE $\dot x = h(x(t))$. But I'm confused about what's going on here with the scaling.

As an example I considered $h(x) = -d (x - 5)$, with $d > 0$. The ODE has a stable equilibrium at $x = 5$ and converges to it from any initial condition. However, for a very large $d$, say $d = 10^6$, the stochastic version would overshoot wildly and diverge, since $a(n)\, d \gg 2$ for the early $n$. My guess is that the "scaling" here is meant to handle this in a way that makes the stochastic sequence behave as the ODE does, but I can't understand how it works.

In my example, with that large $d$ value, the iterates of the original trajectory would be roughly $1,\ 4\times 10^{6},\ -2\times 10^{12},\ 7\times 10^{17}, \dots$, with alternating signs. So if we divided each by its absolute value, the sequence would look like $1, 1, -1, 1, \dots$, and I don't see how that would asymptotically track the limiting ODE.
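A quick numerical check of these magnitudes (noise omitted, since the drift term dominates anyway; the helper name is mine):

```python
def sa_step(x, n, d=1e6, target=5.0):
    # One deterministic step of x_{n+1} = x_n + a(n) * h(x_n),
    # with h(x) = -d * (x - target) and a(n) = 1/n.
    return x + (1.0 / n) * (-d * (x - target))

x = 1.0
traj = [x]
for n in range(1, 5):
    x = sa_step(x, n)
    traj.append(x)
# traj grows roughly like 1, 4e6, -2e12, 7e17, -2e23: each step multiplies
# the magnitude by about a(n) * d while flipping the sign.
```

So the raw iterates blow up geometrically, which is exactly the situation the scaling argument has to rule out.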

For my example, the scaled functions would be $h_c(x) = h(cx)/c = -d(x - 5/c)$: nearly the same dynamics, but with the equilibrium moving to the origin as $c \to \infty$, as he mentions.
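To verify this pointwise limit, here is a short sketch (with $d = 2$, an arbitrary choice of mine) showing that $h_c(x) \to -dx$ as $c \to \infty$:

```python
def h(x, d=2.0, target=5.0):
    # The example drift h(x) = -d (x - target).
    return -d * (x - target)

def h_scaled(x, c, d=2.0, target=5.0):
    # h_c(x) = h(c x) / c, which works out to -d * (x - target / c).
    return h(c * x, d=d, target=target) / c

# Evaluate h_c at x = 1 for growing c; the values approach the
# scaling limit h_inf(1) = -d * 1 = -2, whose ODE has the origin
# as its globally stable equilibrium.
vals = [h_scaled(1.0, c) for c in (1.0, 10.0, 100.0, 1e6)]
```

So the limit function $h_\infty(x) = -dx$ keeps the slope $-d$ but forgets the offset $5$, which is consistent with the equilibrium moving to the origin.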

Can anyone shed some light on what these scaled $h_c$ functions are supposed to do, and why his claim about the scaled sequence holds?