I'm looking for a result that I suspect should be fairly standard and well-known to probabilists and statistical physicists (and perhaps numerical/financial analysts). The result in question should say something like this:
Guess: Let $\beta>0$ real and let $V \colon \mathbb{R}^n \to \mathbb{R}$ be a smooth function such that $\int\exp(-\beta V)$ converges. Let $\varepsilon>0$ (“small”). Now consider the following random process with $x_0$ arbitrary: $$u_i \leftarrow \operatorname{Normal}(0,1)$$ $$x_{i+1} = x_i + \varepsilon u_i - \frac{\beta}{2}\, \varepsilon^2\, \nabla V(x_i)$$ (in other words, $x_i$ is a random walk with normal steps of variance $\varepsilon^2$ except that we also add a gradient descent term along $V$).
Then $x_i$ converges in distribution (when $i\to+\infty$) to a law having density $g_\varepsilon$, which itself converges to $g \propto \exp(-\beta V)$ when $\varepsilon \to 0$.
(That is, $g = \frac{\exp(-\beta V)}{\int\exp(-\beta V)}$, and I expect $g_\varepsilon$ to be some kind of convolution of $g$ by a normal distribution.)
So the point is, this “random walk plus gradient descent” process should have a stationary distribution, which for $\varepsilon>0$ small should be well approximated by (something proportional to) $\exp(-\beta V)$. Intuitively, the greater $V$, the less time the process spends in the region in question, and exponentially so for larger $V$.
(My goal is to construct a random process whose stationary distribution is proportional to some prescribed function, here represented by $\exp(-\beta V)$, and which looks locally like a random walk. I am aware of the Metropolis-Hastings algorithm, but the fact that it can leave the point unchanged is problematic to me. The above process is supposed to be analogous in spirit to M-H, however, and is perhaps describable as some kind of limit of it when taking a large number of very small steps.)
Of course $\beta$ plays no role in what I just wrote, it's just $\beta V$ that appears, but I believe leaving $\beta$ in place makes the result more understandable and perhaps more standard in its notations (in physicists' terms, $V$ should be a kind of “potential” and $\beta$ an “inverse temperature”).
I also expect a continuous version of the result to hold, and it is also of interest to, me, for a SDE of the form $dX_t = dB_t - \frac{\beta}{2}\, \nabla V(X_t)$, as a limit of the above discrete process.
So, question: is the above “guess”, or at least some reasonable approximation to it, correct (and a standard result)? What are the relevant keywords that I could search for to know more about this? And if my guess is completely wrong, is there something vaguely similar that would give a random walk which converges to a distribution proportional to a given $\exp(-\beta V)$?
To answer my first question, the keyword I was missing is “first-order Langevin diffusion”¹ (thanks ot Arthur B. for pointing it out to me).
The continuous-time version $dX_t = dB_t - \frac{\beta}{2}\, \nabla V(X_t)$, while more difficult to define rigorously, is more easy to explain: see this question on MathOverflow for a discussion. Nevertheless, convergence of the discrete-time version can also be studied precisely, esp. in relation to optimization algorithms, and this paper (Xu, Chen, Zou & Gu, “Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization”) seems to do a good job of explaining what is known in this domain.