Consider an irreducible but possibly periodic Markov chain on a finite state space with transition matrix $P$. We know there exists a unique stationary distribution $\pi$. If the Markov chain were aperiodic, we would have $P^n_{ij} \to \pi(j)$ as $n \to \infty$. This fails if the chain is periodic, but we still have convergence of the Cesàro averages: $$\frac{1}{n}\sum_{k=1}^n P^k_{ij} \to \pi(j) \text{ as } n \to \infty.$$ Can anyone point me to a reference that states this fact? Every reference I've seen either considers convergence only for aperiodic chains, or "fixes" periodicity by passing to a lazy version of the chain. Alternatively, is there a simple way to derive this result from the result for aperiodic chains?
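For concreteness, both behaviors are easy to check numerically. Here is a quick sketch with numpy (the deterministic 3-cycle is just my own toy example): $P^n$ cycles forever, while the Cesàro averages converge to the uniform stationary distribution.

```python
import numpy as np

# Deterministic 3-cycle: irreducible with period 3,
# stationary distribution pi = (1/3, 1/3, 1/3).
P = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 0., 0.]])

n = 3000
avg = np.zeros_like(P)
Pk = np.eye(3)
for _ in range(n):
    Pk = Pk @ P        # Pk = P^k
    avg += Pk
avg /= n               # (1/n) * sum_{k=1}^n P^k

print(avg)                               # every entry is 1/3
print(np.linalg.matrix_power(P, n),      # P^n itself keeps cycling
      np.linalg.matrix_power(P, n + 1))  # through permutation matrices
```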
Reference request: convergence for periodic Markov chains
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 answers below.
The result follows immediately from applying the elementary renewal theorem to delayed renewal processes.
Here's a more elementary algebraic proof, using telescoping.
(problem 16, page 468 of Grinstead and Snell's free book https://math.dartmouth.edu/~prob/prob/prob.pdf )
For a row-stochastic $m \times m$ matrix $P$ with stationary distribution $\mathbf \pi$, we have
$\mathbf \pi^T P = \mathbf \pi^T$
and $ P\mathbf 1 = \mathbf 1$.
Define $W:= \mathbf 1 \mathbf \pi^T$; note $\text{trace}\big(W\big) = \mathbf \pi^T \mathbf 1 = 1$.
Consider the following telescoping identity:
$\Big(I+P+P^2+\cdots+ P^{n-1}\Big)\Big(I-P+W\Big) = I -P^n +nW$
thus
$\frac{1}{n}\Big(I+P+P^2+\cdots+ P^{n-1}\Big)$
$= \frac{1}{n}\big(I -P^n +nW\big)\Big(I-P+W\Big)^{-1} $
$= \frac{1}{n}\Big\{\Big(I-P+W\Big)^{-1}\Big\} -\frac{1}{n}\Big\{P^n\Big(I-P+W\Big)^{-1}\Big\} +\frac{1}{n}\Big\{nW\Big(I-P+W\Big)^{-1}\Big\} $
$= \frac{1}{n}\Big(I-P+W\Big)^{-1} -\frac{1}{n}P^n\Big(I-P+W\Big)^{-1} +W$
Now pass to the limit:
$\lim_{n\to\infty}\Big\{ \frac{1}{n}\Big(I-P+W\Big)^{-1} -\frac{1}{n}P^n\Big(I-P+W\Big)^{-1} +W\Big\}$
$= \Big\{\lim_{n\to\infty}\frac{1}{n}\Big(I-P+W\Big)^{-1}\Big\} -\Big\{\lim_{n\to\infty} \frac{1}{n}P^n\Big(I-P+W\Big)^{-1}\Big\} +W$
$=0+0+W$
so
$W=\lim_{n\to\infty} \frac{1}{n}\Big(I+P+P^2+\cdots+ P^{n-1}\Big)$
That's the argument in its entirety. I've left three book-keeping details for the end.
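As a sanity check (not part of the proof), the telescoping identity and the resulting limit are easy to verify numerically; a sketch with numpy, on a 2-state periodic chain of my own choosing:

```python
import numpy as np

# Check (I + P + ... + P^{n-1})(I - P + W) = I - P^n + n W
# on a 2-state chain with period 2, where W = 1 pi^T.
P = np.array([[0., 1.],
              [1., 0.]])
pi = np.array([0.5, 0.5])          # stationary distribution
W = np.outer(np.ones(2), pi)       # W = 1 pi^T

n = 25
S = sum(np.linalg.matrix_power(P, k) for k in range(n))  # I + P + ... + P^{n-1}
lhs = S @ (np.eye(2) - P + W)
rhs = np.eye(2) - np.linalg.matrix_power(P, n) + n * W
print(np.allclose(lhs, rhs))       # True
print(S / n)                       # already close to W
```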
re: the third term simplification $W\Big(I-P+W\Big)^{-1}=W$
Suppose $\Big(I-P+W\Big)^{-1}$ exists (invertibility is the third detail below), and consider the product
$W\Big(I-P+W\Big) = W-WP +W^2 = W-W + W = W$
now multiply both sides on the right by $\Big(I-P+W\Big)^{-1}$
re: the second limit
observe that
$\Big\Vert \frac{1}{n}P^n\Big(I-P+W\Big)^{-1} - \mathbf 0\Big\Vert_F$
$ = \frac{1}{n}\Big\Vert P^n\Big(I-P+W\Big)^{-1}\Big\Vert_F$
$ \leq \frac{1}{n}\Big\Vert P^n\Big\Vert_F\cdot \Big\Vert \Big(I-P+W\Big)^{-1}\Big\Vert_F $
$ \leq \frac{1}{n} \mathbf 1^T P^n \mathbf 1 \cdot \Big\Vert \Big(I-P+W\Big)^{-1}\Big\Vert_F $
$ = \frac{1}{n} \mathbf 1^T \mathbf 1 \cdot \Big\Vert \Big(I-P+W\Big)^{-1}\Big\Vert_F $
$ = \frac{m}{n} \cdot \Big\Vert \Big(I-P+W\Big)^{-1}\Big\Vert_F$
$ \lt \epsilon $
for large enough $n$.
(The second-to-last inequality holds because $\Vert A \Vert_F \le \sum_{i,j} \vert a_{ij}\vert$, i.e. the entrywise $2$-norm is dominated by the entrywise $1$-norm, and the entries of $P^n$ are nonnegative.)
re: the invertibility of $\Big(I-P+W\Big)$
we prove $\det\Big(I-P+W\Big)=\prod_{j=2}^m (1-\lambda_j)$, where $\lambda_1 = 1, \lambda_2,\dots,\lambda_m$ are the eigenvalues of $P$; since $\lambda_1=1$ is a simple eigenvalue (see below), no factor vanishes and hence the matrix is invertible.
the nicest proof involves (partial) symmetrization:
using Perron–Frobenius theory, we know that $\lambda_1 =1 $ is a simple eigenvalue since $P$ is irreducible.
$\mathbf v_1 := \mathbf \pi^\frac{1}{2}\cdot \frac{1}{\big \Vert \mathbf \pi^\frac{1}{2}\big \Vert_2}$
(where the square root is interpreted component-wise; in fact $\big \Vert \mathbf \pi^\frac{1}{2}\big \Vert_2 = 1$ since the entries of $\mathbf \pi$ sum to $1$)
and let $D:=\text{diag}\big(\mathbf v_1\big)$ be the corresponding diagonal matrix.
Consider the similar matrix
$D\Big(I-P+W\Big)D^{-1} = I- (DPD^{-1}) +DWD^{-1} = I - B + \mathbf v_1\mathbf v_1^T$
$B$ has $\mathbf v_1$ as both a left and a right eigenvector for the eigenvalue $1$ (check!).
Working over $\mathbb C$ and applying Schur Triangularization to $B$:
$V := \bigg[\begin{array}{c|c|c|c}\mathbf v_1 & \mathbf v_2 &\cdots & \mathbf v_{m}\end{array}\bigg]$ (unitary, with first column $\mathbf v_1$)
$B = VRV^{-1} = VRV^{*} =V\begin{bmatrix} 1 & \mathbf x_{m-1}^*\\ \mathbf 0 & \mathbf R_{m-1} \end{bmatrix}V^* =V\begin{bmatrix} 1 & \mathbf 0^T\\ \mathbf 0 & \mathbf R_{m-1} \end{bmatrix}V^*$
note $\mathbf x_{m-1} = \mathbf 0$: since $\mathbf v_1$ is real, $\mathbf v_1^* = \mathbf v_1^T$ is a left eigenvector, so $ \mathbf v_1^* =\mathbf v_1^* B = 1\cdot \mathbf v_1^* + \sum_{j\ge 2} \bar x_j\cdot \mathbf v_j^*$
and the columns of $\mathbf V$ (i.e. rows of $\mathbf V^*$) are linearly independent, so every $x_j =0$
By simplicity of the Perron root, $\mathbf R_{m-1}$ does not have $1$ as an eigenvalue, so
$I -B + \mathbf v_1 \mathbf v_1^T = V\big(I-R + \mathbf e_1\mathbf e_1^T\big)V^{*} =V\begin{bmatrix} 1 & \mathbf 0^T \\ \mathbf 0 & I_{m-1} -\mathbf R_{m-1} \end{bmatrix}V^*$
hence the determinant is $1\cdot \prod_{j=2}^m (1-\lambda_j) \neq 0$ (similar matrices have the same determinant).
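The determinant identity can also be sanity-checked numerically; here is a sketch with numpy on a random strictly positive stochastic matrix (my own example, not taken from the proof):

```python
import numpy as np

# Numerical sketch of: det(I - P + W) = prod_{j>=2} (1 - lambda_j),
# where lambda_1 = 1 is the Perron eigenvalue and W = 1 pi^T.
rng = np.random.default_rng(0)
m = 4
P = rng.random((m, m))
P /= P.sum(axis=1, keepdims=True)            # make rows sum to 1

# stationary distribution = left eigenvector of P for eigenvalue 1
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi /= pi.sum()
W = np.outer(np.ones(m), pi)                 # W = 1 pi^T

lam = np.linalg.eigvals(P)
rest = np.delete(lam, np.argmin(np.abs(lam - 1)))  # lambda_2, ..., lambda_m
det_lhs = np.linalg.det(np.eye(m) - P + W)
det_rhs = np.real(np.prod(1 - rest))
print(det_lhs, det_rhs)                      # these agree
```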
I'd like to present a different elementary approach. I'm omitting some details.
From irreducibility, for any pair of states $i,i'$, there exists $n_{i,i'}$ such that $p^{n_{i,i'}}_{i,i'}>0$. Therefore the probability of visiting $i'$ by time $n_{i,i'}$ or earlier, starting from $i$, is at least $p^{n_{i,i'}}_{i,i'}$. Let $\bar p = \min p^{n_{i,i'}}_{i,i'}>0$, and let $\bar n= \max n_{i,i'}<\infty$.
That's the key to everything.
Thus, regardless of the current state and entire past, the probability that the process will visit $i'$ at least once in the next $\bar n$ steps is at least $\bar p$. In particular, the probability that state $i'$ was not visited by time $L{\bar n}$ is bounded above by $(1-{\bar p})^L$. Let $\tau_{i'}$ be the first time the chain hits state $i'$. Then we have shown $P_i(\tau_{i'}>L \bar n) \le (1-{\bar p})^L \to 0$. In particular, $\tau_{i'}$ has finite expectation under $P_i$. This holds for all $i,i'$.
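On a finite state space these quantities can be computed exactly, which makes the geometric tail easy to illustrate; a sketch with numpy, on a small period-2 chain of my own (states $0,1,2$, target $i'=2$): deleting the target's row and column from $P$ gives a substochastic $Q$ with $P_i(\tau_{i'}>t) = (Q^t\mathbf 1)_i$ and $E_i[\tau_{i'}] = ((I-Q)^{-1}\mathbf 1)_i$.

```python
import numpy as np

# Period-2 chain on states {0, 1, 2}; target state i' = 2.
P = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
Q = P[:2, :2]                        # drop the target's row and column

# P_i(tau_{i'} > t) is the i-th entry of Q^t 1: geometric decay.
tail = lambda t: np.linalg.matrix_power(Q, t) @ np.ones(2)
print(tail(5), tail(10), tail(20))

# E_i[tau_{i'}] solves (I - Q) x = 1: finite expected hitting times.
Etau = np.linalg.solve(np.eye(2) - Q, np.ones(2))
print(Etau)
```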
Write $S_n (i,j)= \sum_{k=0}^n p^k_{ij}$.
Then
\begin{align*} S_n(i,j)&= E_i\Big[\sum_{k=0}^n {\bf 1}_{X_k}(j)\Big] \\& \le E_i\Big[ \sum_{k=0}^n {\bf 1}_{X_k} (j), \tau_{i'}\le n\Big] + (n+1) P_i(\tau_{i'}>n)\\ &\le E_i [\tau_{i'}]+ E_{i'} \Big[\sum_{k=\tau_{i'}}^n {\bf 1}_{X_k}(j),\tau_{i'}\le n\Big] +(n+1) P_i (\tau_{i'}>n)\\ & \le E_i [\tau_{i'}] + S_n (i',j) + (n+1) P_i(\tau_{i'}>n). \end{align*}
Therefore,
$$ \limsup_{n\to\infty} \frac{1}{n+1} (S_n (i,j) -S_n(i',j))\le 0.$$
Since this is true for all ordered pairs $i,i'$ (apply it with $i$ and $i'$ swapped to get the matching $\liminf$ bound), the limit exists and is equal to $0$.
Finally, suppose the states are $1,\dots,K$ and let $\pi$ be the stationary measure. Then
$$\pi(j) = \frac{1}{n+1} \sum_{i=1}^K \pi(i)S_n(i,j) =\frac{1}{n+1} S_n (1,j)+ \sum_{i'>1}\pi(i') \frac{1}{n+1} ( S_n (i',j) - S_n (1,j) ). $$
As the sum on the right-hand side tends to $0$, we get $\frac{1}{n+1}S_n(1,j) \to \pi(j)$; combined with $\frac{1}{n+1}(S_n(i,j)-S_n(1,j)) \to 0$, the result follows for every starting state $i$.
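A quick numerical check of the conclusion (a sketch with numpy; the bipartite 3-state chain is my own example, with period 2 and $\pi = (1/2, 1/4, 1/4)$):

```python
import numpy as np

# Bipartite 3-state chain: period 2, stationary pi = (1/2, 1/4, 1/4).
P = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
pi = np.array([0.5, 0.25, 0.25])

n = 4000
S = np.eye(3)          # k = 0 term of S_n = sum_{k=0}^n P^k
Pk = np.eye(3)
for _ in range(n):
    Pk = Pk @ P
    S += Pk
print(S / (n + 1))     # every row is close to pi: same limit for every start state
```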