Proving a Trick to More Quickly Calculate N-Step Transition Probabilities


So, I have been working on a homework problem all day that asks me to prove that $P^n = \Pi + Q^n$, where $P$ is the transition matrix of a finite-state regular Markov chain, $\Pi$ is a matrix whose rows are all the stationary distribution of $P$, and $Q = P - \Pi$.

The fundamental relation that I have been attempting to show is that $(P-\Pi)^n = P^n - \Pi$. I'm not sure whether there is a relevant matrix product property that would allow me to establish this relation, and I can't figure out how to proceed. Also, for those who have math backgrounds but not necessarily statistics backgrounds: the stationary distribution is just a left eigenvector of $P$ with eigenvalue 1, and $P$ is a stochastic matrix (rows summing to 1).

Originally, I attempted to rewrite the matrix product in sigma notation and look for some sort of simplification. That didn't lead anywhere, so here is my most recent attempt:

$P^n= \Pi +Q^n$

Substituting in for Q to get to the relation I referred to earlier:

$P^n=\Pi+(P-\Pi)^n$

$P^n-\Pi=(P-\Pi)^n$

Then, I use the property that the eigenvalue of $\Pi$ is 1 to rewrite $P$ as $\Pi*P$.

$(\Pi*P-\Pi)^n=P^n-\Pi$

$\Pi^n*(P-1)^n=P^n-\Pi$

From here, I am unsure how to proceed but believe that this is a natural way to use the eigenvalue to alter the relation.

There are 2 answers below.

BEST ANSWER
I understand matrix $\Pi$ to be:

$$ \Pi = \begin{bmatrix} \pi_0 & \pi_1 & \cdots & \pi_m \\ \pi_0 & \pi_1 & \cdots & \pi_m \\ \vdots & \vdots & & \vdots \\ \pi_0 & \pi_1 & \cdots & \pi_m \end{bmatrix} $$

The key property of $\Pi$, namely $\;\Pi P = P\Pi = \Pi\,\Pi = \Pi,\;$ is easy to prove (the proof is given below). We have

$$Q^n = (P-\Pi)^n$$

and if we expand that expression using the above property, we get a great deal of simplification (any term containing at least one factor of $\Pi$ reduces to $\pm\Pi$), and the expansion finishes up with the following terms:

\begin{align} 2^{n-1} & \;\text{ instances of } -\Pi \\ 2^{n-1}-1 & \;\text{ instances of } \Pi \\ 1 & \;\text{ instance of } P^n. \end{align}

So we get

$$\Pi + Q^n = \Pi + (-\Pi + P^n) = P^n.$$
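For instance, with $n = 2$ the expansion is short enough to write out in full:

$$(P-\Pi)^2 = P^2 - P\Pi - \Pi P + \Pi^2 = P^2 - \Pi - \Pi + \Pi = P^2 - \Pi,$$

which exhibits the advertised counts: $2^{n-1} = 2$ instances of $-\Pi$, $2^{n-1}-1 = 1$ instance of $+\Pi$, and one instance of $P^n$.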


Proof of above property:

  1. $\Pi P=\Pi$: this is part of the definition of a stationary distribution.

  2. $P\Pi = \Pi$: the $(i,j)^{th}$ entry of $P\Pi$ by matrix multiplication is:

$$\sum_{k} p_{ik}\cdot\pi_j = \pi_j\sum_{k} p_{ik} = \pi_j = (i,j)^{th} \text{ entry of } \Pi.$$

  3. $\Pi\;\Pi = \Pi$: the $(i,j)^{th}$ entry of $\Pi\;\Pi$ by matrix multiplication is:

$$\sum_{k} \pi_{k}\cdot\pi_j = \pi_j\sum_{k} \pi_{k} = \pi_j = (i,j)^{th} \text{ entry of } \Pi.$$
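As a quick numerical sanity check (not part of the proof), the identity $P^n = \Pi + Q^n$ can be verified in R; the $2 \times 2$ chain below is illustrative and not from the original question:

```r
# Illustrative 2-state regular chain (not from the question).
P  <- matrix(c(0.9, 0.1,
               0.4, 0.6), nrow = 2, byrow = TRUE)

# Stationary distribution: left eigenvector of P for eigenvalue 1.
g  <- eigen(t(P))$vectors[, 1]
pv <- as.numeric(g / sum(g))                        # here (0.8, 0.2)
Pi <- matrix(pv, nrow = 2, ncol = 2, byrow = TRUE)  # every row equals pv

Q  <- P - Pi
n  <- 5
Pn <- diag(2); Qn <- diag(2)
for (i in seq_len(n)) { Pn <- Pn %*% P; Qn <- Qn %*% Q }

max(abs(Pn - (Pi + Qn)))   # essentially 0, up to round-off
```

The maximum entrywise difference comes out on the order of machine precision, as the identity predicts.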

ANSWER

Comment: A related computation in R, demonstrating the eigenvector method of finding the limiting distribution of a Markov chain whose finite state space consists of a single intercommunicating class. The limiting distribution is also the stationary distribution.

Note: This is not an answer to the original question, but it may be related to a question in a comment. I suppose a similar computation could be done in Matlab.

$P$ is a $6 \times 6$ transition matrix; it could be entered on a single line, but the format used below is easier to read.

 P  = (1/40)*matrix(c( 0, 20,  5,  5,  5,  5,
                       8,  0,  8,  8,  8,  8,
                      10,  0,  0, 10, 10, 10,
                      10,  0, 10,  0, 10, 10,
                      10,  0, 10, 10,  0, 10,
                      10,  0, 10, 10, 10,  0), nrow=6, byrow=T)
 P
 ##      [,1] [,2]  [,3]  [,4]  [,5]  [,6]
 ## [1,] 0.00  0.5 0.125 0.125 0.125 0.125
 ## [2,] 0.20  0.0 0.200 0.200 0.200 0.200
 ## [3,] 0.25  0.0 0.000 0.250 0.250 0.250
 ## [4,] 0.25  0.0 0.250 0.000 0.250 0.250
 ## [5,] 0.25  0.0 0.250 0.250 0.000 0.250
 ## [6,] 0.25  0.0 0.250 0.250 0.250 0.000

  g = eigen(t(P))$vectors[,1]      # t=transpose for row (not column) eigenvector
  lim.dist = g/sum(g)              # to make vector sum to 1
  lim.dist = as.numeric(lim.dist)  # kill needless complex number notation

R returns the matrix of eigenvectors with columns ordered by decreasing modulus of the corresponding eigenvalues; we want the one with the largest modulus (here, eigenvalue 1), hence the notation [,1] above, which retrieves the first column of that matrix. In general, eigenvectors may have complex elements; this one has only real elements, and the command as.numeric simplifies the output.

 g
 ## [1] -0.4719292+0i -0.2359646+0i -0.4247363+0i -0.4247363+0i
 ## [5] -0.4247363+0i -0.4247363+0i

 lim.dist
 ## 0.19607843 0.09803922 0.17647059 0.17647059 0.17647059 0.17647059

Check that we have the correct answer by verifying $\lambda P = \lambda$, where $\lambda$ denotes the limiting distribution lim.dist. [Also, $P^8$ has all rows (very nearly) equal to $\lambda$.]

 lim.dist %*% P   # '%*%' denotes matrix multiplication in R
 ##      [,1]       [,2]      [,3]      [,4]      [,5]      [,6]
 ## 0.1960784 0.09803922 0.1764706 0.1764706 0.1764706 0.1764706