Given a state transition matrix for Markov Chain, how to get the distribution of state over time?

1.3k Views Asked by At

Suppose that we have a Markov chain with two states: UP or DOWN. Its state transition matrix as the following:

${\displaystyle T={\begin{bmatrix}a&b\\c&d\end{bmatrix}} }$

Consider probability $S(n)$, which denotes the number of times that the chain is in state UP in the first n steps.

How to calculate $S(n)$ ?

For example,

let n = 100,

S(100) will be probability that all 100 steps, the chain is in UP state,

S(99) will be the probability that 99 steps, the chain is in UP state,

:

S(1) will be the probability that only 1 step, the chain is in UP state

How to calculate $S(n)$ ?

1

There are 1 best solutions below

3
On

Based on your question, it is very possible that you misunderstand things about Markov chains as well as the difference between random variables and their probabilities. Thus it is difficult for me to know what you are actually looking for as an answer. For general discrete Markov chains the distribution of the number of visits to a given state in the first $n$ transitions $S(n)$ is highly difficult to compute and would have a disgusting formula. However it turns out, for this simple example, there is a nice(ish) closed form for the distribution of $S(n)$.


In the following I am assuming the transition matrix is \begin{equation} T = \begin{pmatrix} p & 1-p \\ 1-p & p \end{pmatrix}. \end{equation} In the following I am also renaming the states to $A := \text{UP}$ and $B:= \text{DOWN}$, this is just to make the formatting easier to read. So $p = P_{A,A} = P_{B,B}$ and $1-p = P_{A,B} = P_{B,A}$. In the following I will also assume that $X_0 = A$.

Let $S(n)$ be the number of times state $A$ is visited in $n$ transitions. Note that a sequence of $n$ transitions can be represented as a length $n$ sequence of $A$'s and $B$'s and so $S(n)$ can be considered as a random variable over binary sequences of length $n$. Consider the length $6$ sequence $X_0ABABB$. Since there are $3$ total transitions from $A$ to $B$ or from $B$ to $A$ and $2$ transitions from $A$ to $A$ or from $B$ to $B$, this binary sequence occurs with probability $p^2(1-p)^3$. Thus we have \begin{equation} P(S(n) = k+1\,|\, X_0 = A) = \sum_{i=0}^{M} c^i_{n,k} p^{n-1-i}(1-p)^{i}, \end{equation} where $c^i_{n,k}$ is the number of possible sequences of length $n$ with $k+1$ $A$'s where there are a total of $i$ transitions between $A$ and $B$, and $M$ is the maximum number of possible transitions between $A$ and $B$ for given $n$ and $k+1$. In order to compute $c_i^k$ we need to use some combinatorial argument.

Since we assumed that $X_0 = A$, if a binary sequence ends in a $B$ then the number of transitions between $A$ and $B$ is odd. And alternatively, if the binary sequence ends in a $A$ then number of transitions between $A$ and $B$ is even. For a given sequence we define a grouping of $B$s as a contiguous subsequence beginning with a $A$ followed by contiguous $B$'s. For example, in the sequence $X_0ABBAAAB$ there are two groupings of $B$'s, $ABB$ and $AB$. We split the problem into two cases:

Case 1, sequence ends in B: If the number of transitions between $A$ and $B$ is $2m+1$ then there are $m+1$ groupings of $B$'s. So $c^i_{n,k}$ is the number of length $n$ binary sequences with $m+1$ runs of $B$'s with $n - k-1$ $B$'s. In this setting we have $\max(k+1-m-1,0)$ $A$'s that are not in a run of $B$'s.

For example, if we have the sequence $X_0BBAAB$ we have the groupings $X_0BB$ and $AB$ with one $A$ remaining.

The number of ways to get $m+1$ groupings of $n-k-1$ $B$'s is the number of compositions of $n-k-1$ into $m+1$ parts, $\binom{n-k-2}{m}$. The number of ways to place the remaining $\max(k-m,0)$ $A$'s into the $m+1$ bins (note that we are considering sequences ending in a $B$ and thus there are $m+1$ bins in which we can place the $A$'s) is the number of weak compositions of $\max(k-m,0)$ into $m+1$ parts, $\binom{k-m+m+1-1}{m} = \binom{k}{m}$.

Thus we have $c^{2m+1}_{n,k} = \binom{n-k-2}{m}\binom{k}{m}$.

Case 2, sequence ends in A: Similar to case 1. The only difference being that there are $m$ groupings of $B$'s and so there are $\max(k+1-m,0)$ remaining $A$'s that can be placed into $m+1$ bins.

Thus we have $c^{2m}_{n,k} = \binom{n-k-2}{m-1}\binom{k}{m}$.

Note that these formula fail if $k+1 = n$ or $m=0$. However, if the number of transitions between $A$ and $B$ is $0$ then we must have $k+1 = n$ which occurs with probability $P(S(n) = n) = p^{n-1}$. Thus for $n > k+1$ we have, \begin{align} P(S(n) = k+1\,|\, X_0 = A) = \sum_{i=1}^{M} c^i_{n,k} p^{n-1-i}(1-p)^{i}, \end{align} where $M = \min(2k+1,2(n-k-1))$ and \begin{equation} c^{i}_{n,k} = \begin{cases} \binom{n-k-2}{m}\binom{k}{m}, & i \text{ is odd}, \\ \binom{n-k-2}{m-1}\binom{k}{m}, & i \text{ is even}. \end{cases} \end{equation} This can be confirmed experimentally by computing the empirical distribution by sampling from the Markov chain. E.g. for $n=10$ and for $p=0.4$ we get the distribution: enter image description here


For more general discrete time Markov chains you can probably see that these calculations become intractable. However for large $n$ (under suitable assumptions on the Markov chain) the distribution of $S(n)$ will become approximately normal. The things that you could then consider would be the expected number of visits and the variance of the number of visits.