Equivalent statements about transition matrix of a Markov chain

524 Views Asked by At

Let $P=(p_{ij})_{i,j\in E}$ be a transition matrix and $E$ of finite cardinality. Show that the following three conditions are equivalent: (i) $p$ is irreducible and aperiodic. (ii) $P^n$ is irreducible. (iii) There exists a $n\in\mathbb{N}$ such that $p_{ij}^{(n)}>0$ for every $i,j\in E$.

Edit

In a book I found an understandable proof for (i$)\implies$ (iii), see here.

To be honest, I do not know how to prove the rest and so I need your help, please.

I do not know which implications hold, too and - what is the biggest problem - how to prove them in order to have in the end, that all three statements are indeed equivalent.

So it would be great to get help from you.

1

There are 1 best solutions below

1
On BEST ANSWER

Let's first introduce the terms and definitions together with some motivation we need in order to prove the equivalency of the statements.

Irreducibility and aperiodicity are conditions of central importance in Markov theory, since they play a key role in the study of stationary distributions.

Irreducibility is, loosely speaking, the property that all states of the Markov chain can be reached from all others. More precise:

Let $E=\{1,\ldots,k\}$ the state space and $P=(p_{i,j})_{i,j\in E}$ the transition matrix of a Markov chain $(X_n)_{n\geq 0}$. We say that a state $i$ communicates with another state $j$, symbolically $i\longrightarrow j$, if the chain has positive probability of ever reaching $i$ when we start from $j$. In other words, $i$ communicates with $j$, if there exists an $n$ such that $$P(X_{m+n}=i|X_m=j)>0$$ If $i\longrightarrow j$ and $j\longrightarrow i$, then we say the states $i$ and $j$ communicate, and write $$i\longleftrightarrow j$$.

This takes us directly to irreducibility:

A Markov chain $(X_n)_{n\geq 0}$ with state space $E=\{1,\ldots,k\}$ and transition matrix $P$ is said to be irreducible if for all $i,j\in E$ we have that $i\longleftrightarrow j$. Otherwise the chain is said to be reducible.

Note 1: An easy way to verify that a Markov chain is irreducible is to look at its transition graph, and check that from each state there is a sequence of arrows leading to any other state.

Note 2: Communication $i\longleftrightarrow j$ of states $i,j$ is an equivalence relation on $E$ and thus partitions $E$ into communication classes. So, a Markov chain or transition matrix $P$ is called irreducible if $E$ consists of a single class. This also explains the term reducible, since the analysis of the long-term behavior of a Markov chain can then be reduced to the analysis of the long-term behavior of one or more Markov chains with smaller state space.

Next let's consider the concept of aperiodicity.

Aperiodicity: The period $d(i)$ of a state $i\in E$ is defined as the greatest common divisor of the set of times that the chain can return (i.e., has positive probability of returning) to $i$, given that we start with $X_0=i$. If $d(i)=1$, then we say that the state $i$ is aperiodic. A Markov chain is said to be aperiodic, if all its states are aperiodic. Otherwise the chain is said to be periodic.

Note 3: Consider for instance a weather situation. It's easy to check that regardless of whether the weather today is rain or sunshine, we have for any $n$ that the probability of having the same weather $n$ days later is strictly positive. This is clearly a situation of a Markov chain which is aperiodic.

Note 4: Period is a class property: *If the states $i$ and $j$ communicate, then they have the same period. (see e.g. Theorem 4.2 from Markov Chains - Gibbs Fields, Monte Carlo Simulation and Queues from Pierre Brémaud). We can therefore speak of the period of a communication class or of the period of an irreducible Markov chain.

Note 5: For each communication class the period $d$ of it defines a unique partition of $E$ into $d$ classes $C_0, C_1, \ldots, C_{d-1}$ such that for all $k$ with $i\in C_{k}$, $$\sum_{j\in C_{k+1}}p_{i,j}=1,$$ where by convention $C_d=C_0$, and where $d$ is maximal. (see e.g. Theorem 4.1 in Pierre Brémaud's book).

We further assume that the following theorem about aperiodic Markov chain is known: (see e.g. Theorem 4.1 from Finite Markov Chains and Algorithmic Applications)

Theorem 1: Suppose that we have an aperiodic Markov chain $(X_n)_{n\geq 0}$ with state space $E=\{1,\ldots,k\}$ and transition matrix $P$. Then there exists an $N<\infty$ such that $$p^{(n)}_{i,i}>0$$ for all $i\in\{1,\ldots,k\}$ and all $n\geq N$.


The Proposition:

I assume the proposition is slightly different as it is stated by the OP. So, here I prove the following:

The following statements are equivalent:

(i) A Markov chain $(X_n)_{n\geq 0}$ with transition matrix $P=(p_{i,j})_{i,j\in E}$ is irreducible and aperiodic

(ii) There exists an $n\in\mathbb{N}$ such that $P^n$ is irreducible for all $n \geq N$

(iii) There exists an $N\in\mathbb{N}$ such that $p^{(n)}_{i,j}>0$ for every $i,j\in E$ and for all $n \geq N$


The Proof:

(i) $\Rightarrow$ (iii)

By the assumed aperiodicity and theorem 1 above, there exists an integer $N<\infty$ such that $p^{(n)}_{i,i}>0$ for all $i\in\{1,\ldots,k\}$ and all $n\geq N$. Fix two states $i,j\in E$. By the assumed irreducibility, we can find some $n_{i,j}$ such that $p^{(n_{i,j})}_{i,j}>0$. Let $M_{i,j}=N+n_{i,j}$. For any $m\geq M_{i,j}$, we have

\begin{align*} P(X_m=j|X_0=i)&\geq P(X_{m-n_{i,j}}=i,X_m=j|X_0=i)\\ &=P(X_{m-n_{i,j}}=i|X_0=i)P(X_{m}=j|X_{m-n_{i,j}}=i)\tag{1}\\ &>0 \end{align*}

The first factor in (1) is positive because $m-n_{i,j}\geq N$, and the second is positive by the choice of $n_{i,j}$. Hence, we have shown that $p^{(m)}_{i,j}>0$ for all $m\geq M_{i,j}$. The claim now follows with $$M=max\{M_{1,1},M_{1,2},\ldots,M_{1,k},M_{2,1},\ldots,M_{k,k}\}.$$ (iii) $\Rightarrow$ (ii)

This is only a rephrasing of (iii).

So, let's assume (iii) is valid. Then for $n\geq N$ and for any $i,j\in E$ $$p^{(n)}_{i,j}>0\quad\text{and}\quad p^{(n)}_{j,i}>0\qquad\Longrightarrow\qquad p^{(n)}_{i,j}\longleftrightarrow p^{(n)}_{j,i}$$ Since all $p^{(n)}_{i,j}$ are in only one communciation class $P^n$ is irreducible and (ii) follows.

(ii) $\Rightarrow$ (i)

We show $\neg (\text{i}) \Rightarrow \neg (\text{ii})$.

Let's first assume $P$ is reducible. This implies according to note 2 above, that the number of partitions of communication classes is greater than one. So, for any $i,j\in E$ which are in different communication classes $p^{(n)}_{i,j}=0$ for all $n\geq N(\in \mathbb{N})$ and $\neg(\text{ii})$ follows.

Suppose now that $P$ is irreducible but has period $d>1$. In this case the state space splits according to note 5 into $d$ sets, $C_0,\ldots,C_{d-1}$, such that the Markov chain always moves from $C_k$ to $C_{k+1}$ (or $C_{d-1}$ to $C_0$).

The transition matrix $P$ has a block structure with blocks $A_0$ to $A_{d-2}$ above the main diagonal and a block $A_{d-1}$ at the lower left side of $P$. Each of the blocks $A_k$ corresponds to the transitions from the cycle $C_k$ to $C_{k+1}$.

Observe, that each further multiplication with $P$ implies a block-shifting until we reach $P^d$, which has a block-diagonal form corresponding to the cyclic classes $C_0,C_1,\ldots,C_{d-1}$. Therefore the $P$-cylic classes $C_0,C_1,\ldots,C_{d-1}$ are all in different $P^d$-communication classes as the diagonal block structure shows. Since this diagonal block-structure holds for all powers of $P^{d}$ $\neg(\text{ii})$ follows.


Note: Most of the notes above and the proof for (i) $\Rightarrow$ (iii) come from Olle Häggström's nice (and small) book Finite Markov Chains and Algorithmic Applications.

The second part of the proof from (ii) $\Rightarrow$ (i) (namely assuming period $d>1$) can be found in chapter I, section 4.2 in Pierre Brémaud's Markov Chains - Gibbs Fields, Monte Carlo Simulation and Queues.