Let $X=(X_t)_{t=0}^\infty$ be an irreducible and aperiodic Markov chain on a finite state space $\Omega$. We know that $X$ admits a unique stationary state $\pi$. For each $x\in \Omega$, let $p_x^{(t)}$ denote the distribution of $X_t$ given that the starting state is $x$. In other words, $p_x^{(t)}(y)$ is the probability of landing on $y$ at the $t$-th step if we had started at $x$.
We define $\Delta_x(t)=\|p^{(t)}_x-\pi\| = \frac{1}{2}\sum_{y\in \Omega} |p^{(t)}_x(y)-\pi(y)|$. That is, $\Delta_x(t)$ is the variation distance between $p^{(t)}_x$ and $\pi$.
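The variation distance is easy to compute directly; here is a minimal sketch (the helper name `tv_distance` and the two-state example are my own, not from the text):

```python
# Total variation distance between two distributions on a finite
# state space, each given as a dict mapping state -> probability.
def tv_distance(p, q):
    states = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in states)

# Two-state example: ||(0.9, 0.1) - (0.5, 0.5)|| = 0.5 * (0.4 + 0.4) = 0.4
p = {"a": 0.9, "b": 0.1}
q = {"a": 0.5, "b": 0.5}
print(tv_distance(p, q))  # 0.4
```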
Fact. $\Delta_x(t)$ is a non-increasing function of $t$.
Writing $P$ to denote the transition matrix of the Markov chain, the above fact is easy to see by a simple application of the triangle inequality to $\Delta_x(t+1) = \|p^{(t)}_x P - \pi P\|$.
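One can also check this contraction numerically: multiply the current distribution by $P$ and watch the distance to $\pi$. A minimal sketch, with an assumed 2-state chain (the matrix and $\pi$ below are my own example, not from the text):

```python
# One step of the chain: row vector times transition matrix P.
def step(dist, P):
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Total variation distance between two distributions given as lists.
def tv(p, q):
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

P = [[0.7, 0.3],
     [0.4, 0.6]]
pi = [4/7, 3/7]      # stationary: pi P = pi for this P
p = [1.0, 0.0]       # chain started at state 0

d_prev = tv(p, pi)
for _ in range(5):
    p = step(p, P)
    d = tv(p, pi)
    assert d <= d_prev + 1e-12   # Delta_x(t) never increases
    d_prev = d
```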
But my question is the following.
In this PDF, the above fact is proved in Claim 3.8 using a coupling argument. I was able to follow the argument formally, but can somebody please provide some intuition behind it?
Claim 3.8 $\Delta_x(t)$ is non-increasing in $t$.
Proof: Let $X_0 = x$ and $Y_0$ have the stationary distribution $\pi$. We fix $t$ and couple the distributions of the random variables $X_t$ and $Y_t$ such that $\Pr[X_t \ne Y_t] = \| p_x^{(t)} - \pi\| = \Delta_x(t)$, which is possible because of the Coupling Lemma. We now use this coupling to define a coupling of the distributions of $X_{t+1}$ and $Y_{t+1}$ as follows:
- If $X_t = Y_t$, then set $X_{t+1} = Y_{t+1}$.
- Otherwise, let $X_t \to X_{t+1}$ and $Y_t \to Y_{t+1}$ independently.
Then we have $$\Delta_x(t+1) \equiv \|p_x^{(t+1)} - \pi\| \le \Pr[X_{t+1} \ne Y_{t+1}] \le \Pr[X_t \ne Y_t] = \Delta_x(t).$$ The first inequality holds because of the Coupling Lemma, and the second inequality is true by the construction of the coupling.
The intuition behind the Coupling Lemma is that $\Delta_x(t) = \|p_x^{(t)} - \pi\|$ is the smallest that the probability $\Pr[X_t \ne Y_t]$ can get for any two random variables such that $X_t$ has the $p_x^{(t)}$ distribution and $Y_t$ has the $\pi$ distribution.
When we generate such a coupling in this case, you shouldn't think of $X_t$ and $Y_t$ as being the states of the $t^{\text{th}}$ step of some Markov chain. $X_t$ is just a random state sampled from the $p_x^{(t)}$ distribution, and $Y_t$ from the $\pi$ distribution. They're coupled so as to be equal with the highest probability possible. (Just pick $X_t = Y_t = s$ with probability $\min\{p_x^{(t)}(s), \pi(s)\}$ for each state $s$, and then fill in the remaining probability mass arbitrarily to make $X_t$ and $Y_t$ have the right marginal distributions.)
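The parenthetical recipe above can be sketched in code. This is an illustrative assumption of how one might fill in the leftover mass (drawing the two leftovers independently); the function name and the example distributions are my own:

```python
import random

# Maximal coupling of two distributions p and q (dicts state -> prob):
# with probability min(p(s), q(s)) output X = Y = s; otherwise draw X
# and Y from the normalized leftover masses, which have disjoint
# supports, so X != Y in that branch.
def sample_maximal_coupling(p, q, rng=random):
    states = list(set(p) | set(q))
    overlap = {s: min(p.get(s, 0.0), q.get(s, 0.0)) for s in states}
    stay = sum(overlap.values())          # = 1 - ||p - q||
    if rng.random() < stay:
        s = rng.choices(states, weights=[overlap[s] for s in states])[0]
        return s, s
    wx = [p.get(s, 0.0) - overlap[s] for s in states]
    wy = [q.get(s, 0.0) - overlap[s] for s in states]
    return rng.choices(states, weights=wx)[0], rng.choices(states, weights=wy)[0]

p = {"a": 0.9, "b": 0.1}
q = {"a": 0.5, "b": 0.5}
rng = random.Random(0)
draws = [sample_maximal_coupling(p, q, rng) for _ in range(20000)]
print(sum(x != y for x, y in draws) / 20000)   # close to ||p - q|| = 0.4
```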
We use the probabilities from $p_x^{(t)}$ and $\pi$ to put two tokens $X$ and $Y$ down on the Markov chain. Then they take a step, following the Markov chain rules, and also following the rule that if they start in the same state, they get stuck together.
This guarantees that $\Pr[X_{t+1} \ne Y_{t+1}] \le \Pr[X_t \ne Y_t]$: tokens that are stuck together stay together, while separated tokens may or may not meet. Since the Coupling Lemma gives $\Delta_x(t+1) \le \Pr[X_{t+1} \ne Y_{t+1}]$, we conclude that $\Delta_x(t+1) \le \Delta_x(t)$.
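The two-token picture can be simulated directly. A minimal sketch, under an assumed 2-state chain (the transition probabilities and starting states are my own illustration): pairs that start apart either coalesce or stay apart, and once stuck together they never separate, so the empirical fraction of mismatched pairs can only go down.

```python
import random

# Transition rules of an assumed 2-state chain: state -> [(next, prob), ...]
P = {"a": [("a", 0.7), ("b", 0.3)],
     "b": [("a", 0.4), ("b", 0.6)]}

def step(s, rng):
    states, weights = zip(*P[s])
    return rng.choices(states, weights=weights)[0]

rng = random.Random(1)
trials = 5000
pairs = [("a", "b")] * trials        # start the two tokens apart
mismatch = []
for t in range(6):
    mismatch.append(sum(x != y for x, y in pairs) / trials)
    new_pairs = []
    for x, y in pairs:
        if x == y:                   # stuck together: move as one token
            nx = step(x, rng)
            new_pairs.append((nx, nx))
        else:                        # apart: move independently
            new_pairs.append((step(x, rng), step(y, rng)))
    pairs = new_pairs
print(mismatch)                      # a non-increasing sequence
```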