Where do these convergence conditions come from?


I have been reading a book on Reinforcement Learning, and the author mentions "a well-known result in stochastic approximation theory" that gives the conditions required to assure convergence with probability 1. Screenshot of these conditions below:

[Screenshot of the conditions; they are the step-size conditions $\sum_{n=1}^{\infty} \alpha_n = \infty$ and $\sum_{n=1}^{\infty} \alpha_n^2 < \infty$.]

What is this well-known result? I've been trying to google around for it but am having trouble finding it.


Edit for additional clarity

Here the $\alpha_n$ are step-size parameters for processing rewards. They are defined on p. 25 of the book (the printed page number, not the page count in a PDF reader), and the screenshot is taken from p. 26.

1 Answer

In a lecture, David Silver mentions that these are the conditions for the Robbins-Monro algorithm to converge, which is what I was looking for.
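As an illustration (not from the original post), here is a minimal Python sketch of why these step-size conditions matter, using a simple mean-estimation problem as in the bandit setting of the book's early chapters. The function name and setup are hypothetical; with $\alpha_n = 1/n$ (which satisfies $\sum_n \alpha_n = \infty$ and $\sum_n \alpha_n^2 < \infty$), the incremental update reduces to the sample mean and converges to the true value:

```python
import random

def estimate_mean(true_mean, n_steps, seed=0):
    """Estimate the mean of noisy samples with the stochastic
    approximation update Q <- Q + alpha_n * (R - Q), using
    Robbins-Monro step sizes alpha_n = 1/n."""
    rng = random.Random(seed)
    q = 0.0
    for n in range(1, n_steps + 1):
        reward = true_mean + rng.gauss(0.0, 1.0)  # noisy observation R
        alpha = 1.0 / n                           # step size alpha_n
        q += alpha * (reward - q)                 # incremental update
    return q

estimate = estimate_mean(5.0, 100_000)  # approaches 5.0 as n grows
```

A constant step size $\alpha_n = \alpha$ violates the second condition ($\sum_n \alpha^2 = \infty$), so the estimate keeps fluctuating with the noise and never fully converges, though that is exactly what makes constant step sizes useful for tracking nonstationary problems.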