I was following a proof in Gilbert Strang's book "Introduction to Linear Algebra", and I am confused by one step of it.
Suppose we have an $n \times n$ stochastic matrix $A$: all entries are nonnegative and each column sums to $1$ (Strang's Markov-matrix convention, so that $Au$ is again a probability vector whenever $u$ is one).
There is a proof that the largest eigenvalue of $A$ equals $1$ and that the other eigenvalues are less than $1$ in absolute value. I found it here: Proof that the largest eigenvalue of a stochastic matrix is 1
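Since the linked result is the starting point, here is a quick numerical sanity check of it (a sketch only; the $3 \times 3$ matrix below is a hypothetical example, written in the column-stochastic convention that matches the $Au_0$ updates later on):

```python
import numpy as np

# A hypothetical 3x3 stochastic matrix, chosen only for illustration
# (column convention: each column sums to 1).
A = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.6, 0.3],
              [0.2, 0.2, 0.4]])

eigvals = np.linalg.eigvals(A)
# The spectral radius is 1, and 1 is itself an eigenvalue.
print(max(abs(eigvals)))             # 1.0, up to floating-point error
print(np.isclose(eigvals, 1).any())  # True
```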
Now I want to prove that a Markov chain has a steady state that does not depend on the initial probability distribution:
$$ u_0 = \left( \begin{smallmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{smallmatrix} \right) $$
Let's apply diagonalization to our matrix $A$:
$$ A = S \Lambda S^{-1} $$
where $S$ has the eigenvectors of $A$ as its columns and $\Lambda$ is the diagonal matrix of the corresponding eigenvalues.
Suppose we want to represent our initial distribution as a linear combination of the eigenvectors of $A$, like this:
$$ u_0 = c_1 x_1 + c_2x_2 + \ldots + c_nx_n $$
In matrix form, with $C$ the column vector of the coefficients $c_i$:
$$u_0 = SC$$
We can get $C$ from:
$$C = S^{-1}u_0$$
So, when we apply $A$ to $u_0$ $k$ times, the inner factors $S^{-1}S$ cancel:
$$ u_k = A \ldots Au_0 = S \Lambda S^{-1} \ldots S \Lambda S^{-1} u_0 = S \Lambda S^{-1} \ldots S \Lambda S^{-1} SC = S \Lambda^{k} C$$
Written out componentwise, this is:
$$ u_k = c_1(\lambda_1)^kx_1 + c_2(\lambda_2)^kx_2 + \ldots + c_n(\lambda_n)^kx_n $$
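To check that this diagonalization bookkeeping is right, here is a small numerical sketch; the matrix $A$ and vector $u_0$ are hypothetical values picked just for illustration, with the columns of $A$ summing to $1$ to match the $Au_0$ convention:

```python
import numpy as np

A = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.6, 0.3],
              [0.2, 0.2, 0.4]])
u0 = np.array([0.2, 0.5, 0.3])   # an arbitrary probability vector

# Columns of S are the eigenvectors x_i; lam holds the eigenvalues lambda_i.
lam, S = np.linalg.eig(A)
C = np.linalg.solve(S, u0)       # C = S^{-1} u0

k = 10
uk_power = np.linalg.matrix_power(A, k) @ u0   # A^k u0 directly
uk_eig = S @ (lam**k * C)                      # S Lambda^k C, elementwise powers
print(np.allclose(uk_power, uk_eig))           # True: the two forms agree
```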
But then the author writes this in the book:
$$ u_k = x_1 + c_2(\lambda_2)^kx_2 + \ldots + c_n(\lambda_n)^kx_n $$
I understand that the author omits $\lambda_1$ because it is equal to 1. Why does the author omit $c_1$?
EDIT: I found out that $c_1$ equals $1$; that is why the author omits it. But I don't know why it equals $1$.
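On the EDIT: $c_1 = 1$ holds only under a particular normalization, namely when the entries of $x_1$ sum to $1$. A numerical sketch of this (same hypothetical column-stochastic example values as before, chosen only for illustration):

```python
import numpy as np

A = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.6, 0.3],
              [0.2, 0.2, 0.4]])
u0 = np.array([0.2, 0.5, 0.3])       # any probability vector

lam, S = np.linalg.eig(A)
i = np.argmax(lam.real)              # position of the eigenvalue 1
S[:, i] = S[:, i] / S[:, i].sum()    # rescale x1 so its entries sum to 1
C = np.linalg.solve(S, u0)           # coefficients in u0 = sum_i c_i x_i
print(np.isclose(C[i], 1.0))         # True: c1 = 1 under this normalization
```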
Later in his proof the author shows that:
$$ \lim_{k\rightarrow \infty } u_k = x_1 $$
So, the author concludes that steady state is equal to the eigenvector with corresponding eigenvalue of 1.
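The claimed independence from the initial distribution can also be checked numerically. In this sketch (the regular column-stochastic $A$ is a hypothetical example), two very different starting distributions are iterated and land on the same limit:

```python
import numpy as np

A = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.6, 0.3],
              [0.2, 0.2, 0.4]])

# Two very different initial probability distributions.
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 0.0, 1.0])
for _ in range(60):
    u, v = A @ u, A @ v          # u_{k+1} = A u_k

print(np.allclose(u, v))         # True: both reach the same steady state
print(np.allclose(A @ u, u))     # True: the limit is a fixed point of A
```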
This proof seems a little confusing to me. First of all, consider the stochastic matrix \begin{equation} \begin{bmatrix} 0&1 \\ 1&0 \end{bmatrix},\end{equation} which is diagonalizable but has no steady state.
So I think it must be specified that $A$ is regular, i.e., that some power of $A$ has only positive entries.
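A quick numerical illustration of why regularity matters, using the permutation matrix above:

```python
import numpy as np

# The 2x2 permutation matrix from above: stochastic and diagonalizable
# (eigenvalues 1 and -1), but its powers never converge.
P = np.array([[0., 1.],
              [1., 0.]])

u0 = np.array([1.0, 0.0])
print(P @ u0)        # [0. 1.]
print(P @ P @ u0)    # [1. 0.]  the distribution oscillates with period 2
# No power of P has all entries positive, so P is not regular.
```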
So now, assuming this is true, with regard to your question: the coordinates of $u_0$ with respect to the eigenvector basis are given by $C=S^{-1}u_0$, i.e. \begin{equation} u_0 = SC = c_1x_1 + c_2x_2 + \cdots + c_nx_n,\end{equation} and the $c_i$ depend both on $u_0$ and on how the eigenvectors are scaled, so there is no reason why we must have $c_1=1$ for an arbitrary choice of $x_1$. It does hold once $x_1$ is normalized so that its entries sum to $1$: because the columns of $A$ sum to $1$ (the convention needed for $Au_0$ to stay a probability vector), the all-ones row vector satisfies $\mathbf{1}^TA=\mathbf{1}^T$, hence $\mathbf{1}^Tx_j=\lambda_j\mathbf{1}^Tx_j$ forces $\mathbf{1}^Tx_j=0$ for every $\lambda_j\neq 1$, and then $1=\mathbf{1}^Tu_0=c_1\mathbf{1}^Tx_1=c_1$.
A much better approach (imho) is to use the fact that for a regular stochastic matrix $A$, $\lim_{m \rightarrow \infty}A^m=L$ exists and is a matrix with every column equal to $x_1$, the eigenvector associated with the eigenvalue $1$. The existence of $L$ follows in part from the Perron-Frobenius theorem, or equivalently from the link you have posted together with some other results; I am not going to prove that here. So if we can accept that $L$ indeed exists, then we need two things. First: \begin{equation} AL=A\lim_{m \rightarrow \infty}A^m=\lim_{m \rightarrow \infty}AA^m=\lim_{m \rightarrow \infty}A^{m+1}=L,\end{equation} and second, since $AL=L$, every column of $L$ is an eigenvector of $A$ associated with the eigenvalue $1$.
So having established the above, using $x_1$ as in your notation, we have \begin{equation}\Big(\lim_{m \rightarrow \infty}A^m\Big)u_0=Lu_0=u_1x_1+u_2x_1+\cdots+u_nx_1=(u_1+u_2+\cdots+u_n)x_1=x_1,\end{equation} since $u_0$ is a probability vector and its entries sum to $1$.
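This limiting argument is easy to check numerically as well (a sketch only; the regular column-stochastic $A$ and the probability vector $u_0$ below are hypothetical illustrative values):

```python
import numpy as np

A = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.6, 0.3],
              [0.2, 0.2, 0.4]])

L = np.linalg.matrix_power(A, 50)    # A^m for large m approximates L
x1 = L[:, 0]                         # each column of L is the same vector x1

print(np.allclose(L, np.outer(x1, np.ones(3))))  # True: identical columns
print(np.allclose(A @ x1, x1))                   # True: A x1 = x1

u0 = np.array([0.2, 0.5, 0.3])
print(np.allclose(L @ u0, x1))       # True: L u0 = x1 for any prob. vector
```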