Problems understanding Markov chains in the context of coupling from the past


I have read about the coupling from the past algorithm, which is used for perfect sampling from the stationary distribution of a discrete Markov chain. My question is not exactly about this algorithm, but about why its proof idea cannot also be applied to "forward coupling"; i.e., what is wrong with the following idea.

Let $\Omega$ be the (finite) set of states. Take a distribution on the space of "time-step functions" $\mathcal{F} = \{f: \Omega \to \Omega\}$ that guarantees $\mathbb{P}[f(x) = y] = P(x, y)$ for all $x, y\in \Omega$, where $P$ is the transition matrix of an irreducible, aperiodic chain. For each $t\in\mathbb{Z}$ choose $f_t$ independently at random according to this distribution, and for $t_1 < t_2$ set $f_{t_1}^{t_2} := f_{t_2 -1}\circ \dots \circ f_{t_1}$.
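As a concrete illustration (my own hypothetical example, not from the question), such a distribution on time-step functions can be sketched in Python for a two-state chain with $P(0,1)=1$ and $P(1,0)=P(1,1)=1/2$: the image of each state is drawn by inverse-CDF lookup from an independent uniform, so marginally $\mathbb{P}[f(x)=y]=P(x,y)$.

```python
import random

# Hypothetical example chain (not from the question):
# Omega = {0, 1}, P(0,1) = 1, P(1,0) = P(1,1) = 1/2.
P = [[0.0, 1.0],
     [0.5, 0.5]]

def random_map(rng):
    """Draw one time-step function f: Omega -> Omega with
    P[f(x) = y] = P[x][y], using one independent uniform per state."""
    f = []
    for x in range(len(P)):
        u = rng.random()
        cum = 0.0
        for y, p in enumerate(P[x]):   # inverse-CDF lookup
            cum += p
            if u < cum:
                break
        f.append(y)
    return tuple(f)

rng = random.Random(0)
f = random_map(rng)   # a tuple (f(0), f(1)); here f(0) == 1 always
```

Composing independently drawn maps, $f_{t_2-1}\circ\dots\circ f_{t_1}$, then gives exactly the $f_{t_1}^{t_2}$ above.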

Now I want to do a "forward coupling". Let $M := \min\{t > 0: |f_0^t(\Omega)|=1\}$, i.e. the first time at which, starting at time $t=0$, the paths from all start values have coalesced. Of course, since the $f_t$'s are chosen at random, $M$ is a random variable, and we assume that our Markov chain guarantees $\mathbb{P}[M < \infty] = 1$. Now $f_0^M$ is a random constant function, so let $\tilde{y}\in \Omega$ be its unique image, i.e. $f_0^M(\Omega)=\{\tilde{y}\}$.
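This forward coupling is easy to simulate. Here is a minimal sketch on a hypothetical two-state chain with $P(0,1)=1$, $P(1,0)=P(1,1)=1/2$, so $\pi=(1/3,\,2/3)$: since $f(0)=1$ always, coalescence can only ever happen in state $1$, so $\tilde{y}=1$ in every run even though $\pi(1)=2/3$.

```python
import random

# Hypothetical two-state chain: P(0,1) = 1, P(1,0) = P(1,1) = 1/2,
# with stationary distribution pi = (1/3, 2/3).
def random_map(rng):
    # P[f(0)=1] = 1; P[f(1)=0] = P[f(1)=1] = 1/2
    return (1, 0 if rng.random() < 0.5 else 1)

def forward_couple(rng):
    """Compose fresh maps f_0, f_1, ... until |f_0^t(Omega)| = 1;
    return (M, y_tilde)."""
    states = {0, 1}
    t = 0
    while len(states) > 1:
        f = random_map(rng)
        states = {f[x] for x in states}
        t += 1
    return t, next(iter(states))

rng = random.Random(1)
samples = [forward_couple(rng)[1] for _ in range(10_000)]
# Since f(0) = 1 always, every run coalesces in state 1,
# although pi(1) = 2/3.
```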

Say $\pi$ is the stationary distribution on $\Omega$ with respect to $P$. It is well known that $\tilde{y}$ does not have distribution $\pi$, which is why one needs to "couple from the past" instead (which brings its own difficulties), and I have seen simple counterexamples. But I don't understand why, because for all $y\in\Omega$, $k>0$ and an arbitrary $x\in\Omega$:

$$\mathbb{P}[\tilde{y}=y]=\mathbb{P}[f_0^M(\Omega)=\{y\}]=\mathbb{P}[f_0^M(x)=y] = \mathbb{P}[f_0^M\circ f_{-k}^0(x)=y] = \mathbb{P}[f_{-k}^M(x)=y] = P^{M+k}(x,y)\longrightarrow \pi(y)$$ as $k\longrightarrow\infty$. The third equality comes from the fact that $f_0^M$ is a constant function.

Please find the mistake. Thanks in advance.


BEST ANSWER

I think I have found my own mistake.

The problematic point is the last step, where I treat $\mathbb{P}[f_{-k}^M(x)=\tilde{y}]$ as an $(M+k)$-step transition probability. As I mentioned, $M$ is a random variable that depends on the very same randomly chosen $f_t$'s, so conditioning on $M$ biases the maps $f_0, \dots, f_{M-1}$; the identity with $P^{M+k}(x,\cdot)$, and hence the convergence to $\pi$, is not justified. So to speak, the value of $M$ contains information about the outcome of the randomness.

In the same (wrong) way one could argue for an arbitrary $\tilde{y}\in\Omega$:

Start with an arbitrary state $x\in\Omega$ and run the chain long enough for good mixing: choose $T$ so that $P^t(x, \cdot)$ is $\varepsilon$-close to the stationary distribution $\pi$ for all $t\geq T$, and let $M := \min \{t\geq T: f_{0}^{t}(x)=\tilde{y}\}$ (assuming the chain visits $\tilde{y}$ eventually, so $M<\infty$ a.s.). Then after $M$ steps starting from $x$ the chain is certainly in state $\tilde{y}$, by definition of $M$, while its distribution is supposedly $\varepsilon$-close to $\pi$; since $\varepsilon$ was arbitrary, this would give $\pi(\tilde{y})=1$, which is absurd. The flaw is the same: $M$ is random and correlated with the trajectory, so a mixing bound for deterministic $t$ does not apply at the random time $M$.

Wow, this tells me once again that one has to be careful in mathematics.

ANSWER

If I understand your argument correctly, the issue is not in your math but rather in its algorithmic implications. You only obtain a sample from the stationary distribution as $k\to\infty$, so you never know when to stop mixing, and thus you cannot guarantee an exact sample after finitely many steps.

The key part of coupling from the past is that you "re-use" the randomness. For instance, instead of your $f_t(X)$ where $X$ is random, you use updates $f_t(x, U_t)$ where each $U_t$ is sampled exactly once. You then start a chain "from the past" at every state $x$. If all chains have coupled by the time you reach $t=0$, the common state at time $0$ is an exact draw from the stationary distribution. Otherwise, you restart from further in the past, but you reuse the same $U_t$ as before for the time steps you have already seen, so that part of the run, given the previous one, is deterministic.
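A hedged sketch of this scheme, again on a hypothetical two-state chain with $P(0,1)=1$, $P(1,0)=P(1,1)=1/2$ and $\pi=(1/3,\,2/3)$ (the doubling of the horizon $T$ is one common choice for "restarting from further in the past"):

```python
import random

# Hypothetical two-state chain: P(0,1) = 1, P(1,0) = P(1,1) = 1/2,
# with stationary distribution pi = (1/3, 2/3).
def step(x, u):
    """Deterministic update f(x, u) driven by a single uniform u,
    shared by all chains at that time step."""
    if x == 0:
        return 1                  # state 0 always moves to 1
    return 0 if u < 0.5 else 1    # state 1 flips a fair coin

def cftp(rng):
    """Run from time -T to 0 from every start state, reusing the same
    uniforms U_t on every restart and doubling T until coalescence."""
    us = []   # us[k] drives the step at time -(k+1)
    T = 1
    while True:
        while len(us) < T:
            us.append(rng.random())   # fresh randomness only for new times
        states = {0, 1}
        for k in range(T - 1, -1, -1):   # apply f_{-T}, ..., f_{-1} in order
            states = {step(x, us[k]) for x in states}
        if len(states) == 1:
            return next(iter(states))    # state at time 0: exact draw
        T *= 2

rng = random.Random(2)
n = 30_000
freq1 = sum(cftp(rng) == 1 for _ in range(n)) / n
# freq1 should be close to pi(1) = 2/3
```

Note the contrast with the forward scheme: here coalescence is checked at the fixed time $0$, not at the random coalescence time, which is what removes the bias.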

This way, if you find a good coupling, you can get a nice bound on the expected running time while ensuring an exact sample.