Stationary distribution of gradient-biased random walk

Question

Stationary distribution of gradient-biased random walk

112 Views Asked by Bumbble Comm At 11 May 2026 - 5:28

I'm looking for a result that I suspect should be fairly standard and well-known to probabilists and statistical physicists (and perhaps numerical/financial analysts). The result in question should say something like this:

Guess: Let $\beta>0$ real and let $V \colon \mathbb{R}^n \to \mathbb{R}$ be a smooth function such that $\int\exp(-\beta V)$ converges. Let $\varepsilon>0$ (“small”). Now consider the following random process with $x_0$ arbitrary: $$u_i \leftarrow \operatorname{Normal}(0,1)$$ $$x_{i+1} = x_i + \varepsilon u_i - \frac{\beta}{2}\, \varepsilon^2\, \nabla V(x_i)$$ (in other words, $x_i$ is a random walk with normal steps of variance $\varepsilon^2$ except that we also add a gradient descent term along $V$).

Then $x_i$ converges in distribution (when $i\to+\infty$) to a law having density $g_\varepsilon$, which itself converges to $g \propto \exp(-\beta V)$ when $\varepsilon \to 0$.

(That is, $g = \frac{\exp(-\beta V)}{\int\exp(-\beta V)}$, and I expect $g_\varepsilon$ to be some kind of convolution of $g$ by a normal distribution.)

So the point is, this “random walk plus gradient descent” process should have a stationary distribution, which for $\varepsilon>0$ small should be well approximated by (something proportional to) $\exp(-\beta V)$. Intuitively, the greater $V$, the less time the process spends in the region in question, and exponentially so for larger $V$.

(My goal is to construct a random process whose stationary distribution is proportional to some prescribed function, here represented by $\exp(-\beta V)$, and which looks locally like a random walk. I am aware of the Metropolis-Hastings algorithm, but the fact that it can leave the point unchanged is problematic to me. The above process is supposed to be analogous in spirit to M-H, however, and is perhaps describable as some kind of limit of it when taking a large number of very small steps.)

Of course $\beta$ plays no role in what I just wrote, it's just $\beta V$ that appears, but I believe leaving $\beta$ in place makes the result more understandable and perhaps more standard in its notations (in physicists' terms, $V$ should be a kind of “potential” and $\beta$ an “inverse temperature”).

I also expect a continuous version of the result to hold, and it is also of interest to, me, for a SDE of the form $dX_t = dB_t - \frac{\beta}{2}\, \nabla V(X_t)$, as a limit of the above discrete process.

So, question: is the above “guess”, or at least some reasonable approximation to it, correct (and a standard result)? What are the relevant keywords that I could search for to know more about this? And if my guess is completely wrong, is there something vaguely similar that would give a random walk which converges to a distribution proportional to a given $\exp(-\beta V)$?

Original Q&A

There are 1 best solutions below

**Bumbble Comm** · Answer 1 · 2023-03-04 10:30:22

To answer my first question, the keyword I was missing is “first-order Langevin diffusion”¹ (thanks ot Arthur B. for pointing it out to me).

The continuous-time version $dX_t = dB_t - \frac{\beta}{2}\, \nabla V(X_t)$, while more difficult to define rigorously, is more easy to explain: see this question on MathOverflow for a discussion. Nevertheless, convergence of the discrete-time version can also be studied precisely, esp. in relation to optimization algorithms, and this paper (Xu, Chen, Zou & Gu, “Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization”) seems to do a good job of explaining what is known in this domain.

For a reason that escapes me, Langevin diffusion usually but not always refers to a second-order process where the diffusing particle is subjected to random-noise forces, see e.g., this Wikipedia article, rather than being directly controlled at the first order (by means of displacements, not forces) as in my question. The connection between the two is unclear to me (the first-order process is similar to an “overdamped” version of the second-order process, but I do not understand the exact relation, and if someone wants to post another reply explaining this better, they are more than welcome).

Stationary distribution of gradient-biased random walk

There are 1 best solutions below

Related Questions in PROBABILITY-THEORY

Related Questions in STOCHASTIC-PROCESSES

Related Questions in STOCHASTIC-ANALYSIS

Related Questions in GRADIENT-FLOWS

Trending Questions

Popular # Hahtags

Popular Questions