Understanding stochastic approximation for a function.


I am trying to understand the paper https://arxiv.org/pdf/1606.06988.pdf. Currently I am stuck at the following part:

Let us recall that, in order to construct a stochastic algorithm which approximates the function $f$ at a given point $x$, we need to define an algorithm that searches for the zero of the function $h : y \mapsto f(x) - y$. Following Robbins-Monro's procedure, this algorithm is defined by setting $f_0(x) \in \mathbb{R}$, and, for all $n \geq 1$,

$f_n(x) = f_{n-1}(x) + \gamma_n W_n$,

where $W_n(x)$ is an observation of the function $h$ at the point $f_{n-1}(x)$, and the step sizes $(\gamma_n)$ form a sequence of positive real numbers that goes to zero.

Can someone explain this part to me? I compared it with the Robbins-Monro section on Wikipedia, https://en.wikipedia.org/wiki/Stochastic_approximation#Robbins.E2.80.93Monro_algorithm, but I don't quite get it.

We can rewrite $h : y \mapsto f(x) - y$ as $0 = h(y) = f(x) - y$, i.e. $f(x) = y$, which looks like $M(\theta) = \alpha$ from Wikipedia; so $y$ plays the role of $\alpha$ and $f(x)$ the role of $M(\theta)$. On the other hand, inspecting $f_n(x) = f_{n-1}(x) + \gamma_n W_n$, which looks like $\theta_{n+1} = \theta_n - a_n(N(\theta_n) - \alpha)$, I understand that $f(x)$ plays the role of $\theta$. So what is $f(x)$?
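Here is my own attempt at lining the two recursions up term by term, assuming $W_n$ is a noisy observation of $h(f_{n-1}(x)) = f(x) - f_{n-1}(x)$ (I am not sure this reading is right):

$$f_n(x) = f_{n-1}(x) + \gamma_n W_n \quad\longleftrightarrow\quad \theta_{n+1} = \theta_n - a_n\bigl(N(\theta_n) - \alpha\bigr),$$

which would give the correspondences $\theta_n \leftrightarrow f_{n-1}(x)$, $a_n \leftrightarrow \gamma_n$, and $N(\theta_n) - \alpha \leftrightarrow -W_n$. Under that reading the iterate $\theta_n$ is the *approximation* $f_{n-1}(x)$, while the *root* $\theta^*$ being sought is the true value $f(x)$, which may explain why $f(x)$ seems to appear on both sides of my comparison.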

So: how does searching for the zero of $h$, i.e. solving $0 = h(y) = f(x) - y$, lead to an approximation of $f(x)$? (Disclaimer: I am not sure I understand stochastic approximation in the first place.)
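To at least check the mechanics numerically, I wrote a small toy simulation (my own construction, not from the paper). I take $f(x) = x^2$ at $x = 3$, so the procedure should converge to $f(x) = 9$; each step observes $h$ at the current estimate, corrupted by Gaussian noise:

```python
import random

# Toy example (my own, not from the paper): approximate f(x) = x^2 at x = 3,
# so the target value is f(x) = 9. The function h(y) = f(x) - y has its
# unique zero exactly at y = f(x).

def noisy_h(y, fx=9.0, noise_std=0.1):
    # W_n: a noisy observation of h at the point y, i.e. f(x) - y plus noise
    return fx - y + random.gauss(0.0, noise_std)

random.seed(0)
estimate = 0.0              # f_0(x): arbitrary starting value
for n in range(1, 10001):
    gamma = 1.0 / n         # positive step sizes going to zero
    estimate += gamma * noisy_h(estimate)   # f_n = f_{n-1} + gamma_n * W_n

print(estimate)             # ends up close to 9
```

This does converge to $9$ for me (with $\gamma_n = 1/n$ the iteration is just a running average of the noisy targets), so the mechanics work; it is the translation into Wikipedia's $M, N, \theta, \alpha$ notation that I am unsure about.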