Markov chains: disconnected versus reducible

122 Views Asked by At

Until recently, I was under the mistaken impression that a reducible Markov chain is one whose state space can be partitioned into two non-communicating sets, i.e. a disconnected Markov chain. However, to my understanding, the common definition of reducible is more general, and includes the possibility of two or more non-communicating sets connected by transient states. An example is the chain with states $\{ A, B, C \}$ where $A$ and $C$ are absorbing and $B$ leads to both. To be precise, a chain is irreducible if any two states lead to each other, and is reducible otherwise.

Is there any benefit to distinguishing between connected versus disconnected chains, or does the reducible/irreducible split capture everything of real interest? I know as far as the Ergodic theorem, reducibility is more relevant, but it feels like there should be some value to considering connectivity.

That said, I assume that irreducibility is generally more relevant than reducibility, and irreducibility is a stronger condition than connectedness. Is perhaps connectedness too weak a condition for anything of interest?

1

There are 1 best solutions below

0
On BEST ANSWER

As pointed out by Josh, there is an important distinction between strong and weak connectivity. We can define two states $x,y$ to be weakly connected when either $x$ leads to $y$, $y$ leads to $x$, or both, and define the states to be strongly connected if both $x$ leads to $y$ and $y$ leads to $x$. Under this definition, an irreducible chain is exactly a strongly connected chain (every state strongly connected to every other).

The relationship of weak connectivity is a rather poorly behaved one, in that it isn't even transitive. For instance, for the 3-state example in the question, $B$ is weakly connected with both $A$ and $C$, but $A$ and $C$ are not weakly connected to each other. Strong connectivity doesn't have this issue, and is an equivalence relation (as long as we treat each state as being strongly connected to itself).

Additionally, as pointed out in the question, assuming strong connectivity gives the properties that we want from a chain -- especially that it has a unique stationary distribution. For a weakly connected chain, we can't even guarantee the existence of a stationary distribution. For instance, consider a chain with states being the positive integers and transition probabilities such that the chain always moves from $n$ to $n+1$ (i.e. $p(n,m) = I[m=n+1]$ for all $n$). No matter the starting state, the chain moves off to infinity.

Finally, it is worthwhile to notice that, as far as long term behavior, the transient states are generally pretty irrelevant. Essentially, no matter where the chain starts, its going to end up moving around one of the strongly connected "components" of the graph. Removing these transient states leads to a chain with only irreducible/strongly connected components. Then the various notions of connectivity and irreducibility coincide, and the problem vanishes. [This doesn't really make sense for the special case when the whole chain is transient states, but it is meant primarily for intuition anyway.]