When working with Markov chains and transition matrices $P$, we multiply from the left, meaning that, for example, $\mu^{(n)} = \mu^{(0)}P^n$, or that the stationary distribution satisfies $\pi = \pi P$. In particular, for the stationary distribution this means that $\pi$ is a left eigenvector of the transition matrix $P$.
Why do we do this left multiplication? Is it just a convention or are there any other reasons why this is done? I couldn't think of an intuitive explanation.
In my opinion, it seems more intuitive to write $\mu^{(n)} = P'^n\mu^{(0)}$ and treat $\pi$ as a (normal/right) eigenvector, $\pi = P'\pi$. It seems to me that, in this case, $P' = P^T$, the transpose of $P$.
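A quick numerical sketch of the equivalence between the two conventions, using a hypothetical $2\times 2$ transition matrix (the specific entries are made up for illustration): iterating $\mu \mapsto \mu P$ converges to $\pi$, which is simultaneously a left eigenvector of $P$ and a right eigenvector of $P^T$, both with eigenvalue $1$.

```python
import numpy as np

# Hypothetical 2-state transition matrix (rows sum to 1).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

mu = np.array([1.0, 0.0])      # initial distribution mu^(0), as a row vector
for _ in range(200):
    mu = mu @ P                # left multiplication: mu^(n) = mu^(0) P^n

pi = np.array([2/3, 1/3])      # stationary distribution: solves pi = pi P

print(np.allclose(mu, pi))     # iterates converge to pi
print(np.allclose(pi @ P, pi))     # pi is a left eigenvector of P ...
print(np.allclose(P.T @ pi, pi))   # ... equivalently a right eigenvector of P^T
```

So the choice between $\pi = \pi P$ and $\pi = P^T\pi$ really is just a transpose; the row-vector convention keeps the transition matrix itself untransposed.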
It is indeed quite confusing, also taking into account that we often write $$ \int P(x,A)\mu(\mathrm dx) $$ even though some may prefer $\int \mu(\mathrm dx)P(x,A)$; in the latter case I am often confused about where the integral ends. Nevertheless, it seems that the notation for measures and kernels comes from the fact that functions over finite spaces are usually treated as column vectors, and measures as row vectors. In that case you of course need to write $\mu P$ for the action of $P$ on measures and $Pf$ for its action on functions. Also, $\mu f = \int f\,\mathrm d\mu$.
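Over a finite state space this row/column convention can be checked directly. A minimal sketch (the measure $\mu$, function $f$, and kernel $P$ below are hypothetical example values): $\mu P$ is again a measure, $Pf$ is again a function, $\mu f$ is the expectation $\int f\,\mathrm d\mu$, and associativity gives $(\mu P)f = \mu(Pf)$ with no transposes anywhere.

```python
import numpy as np

# Finite state space {0, 1, 2}; hypothetical values throughout.
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.0, 0.4, 0.6]])   # kernel: row x is the measure P(x, .)
mu = np.array([0.2, 0.5, 0.3])   # a measure, treated as a row vector
f  = np.array([1.0, 2.0, 3.0])   # a function, treated as a column vector

muP = mu @ P        # action of P on measures: another row vector
Pf  = P @ f         # action of P on functions: (Pf)(x) = sum_y P(x,y) f(y)

print(mu @ f)                                  # mu f = integral of f d(mu)
print(np.isclose((mu @ P) @ f, mu @ (P @ f)))  # (mu P) f = mu (P f)
```

The convention earns its keep in that last line: with measures on the left and functions on the right, the parentheses never matter.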
The fact that functions and measures are represented as column and row vectors over finite spaces perhaps has something to do with the co- and contravariance of such representations for finite Markov chains, but it can easily be just a tradition, as Tunococ has mentioned.
One more point: there is another product of measures and kernels, $\mu\otimes P$, which in your case is the joint distribution of the first two coordinates of a Markov chain. I certainly find it convenient to write $\mu$ to the left of $P$, since we also write $x_0, x_1, \dots$ (and not the reverse) for the coordinates of a Markov chain. Perhaps this was a reason for introducing $\mu P$ with the same order of arguments: note that $\mu P(\cdot) = (\mu \otimes P)(X\times \cdot)$. Yet again, $$ (\mu\otimes P)(A\times B) = \int_A P(x,B)\mu(\mathrm dx), $$ where $\mu$ is on the right in the RHS, so unless you are used to writing integrals as $\int \mu(\mathrm dx)P(x,A)$, you may find some of the notation in measure theory and probability not entirely consistent.
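In the finite case the product $\mu\otimes P$ is concrete: $(\mu\otimes P)(x,y) = \mu(x)P(x,y)$, i.e. the matrix $\operatorname{diag}(\mu)\,P$. A small sketch with made-up values, checking that marginalizing out the first coordinate recovers $\mu P$, i.e. $\mu P(\cdot) = (\mu\otimes P)(X\times\cdot)$:

```python
import numpy as np

# Hypothetical finite chain: mu is the law of X_0, P the transition kernel.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
mu = np.array([0.5, 0.5])

# (mu ⊗ P)(x, y) = mu(x) P(x, y): the joint law of (X_0, X_1).
joint = np.diag(mu) @ P          # equivalently mu[:, None] * P

# Summing over the first coordinate gives the law of X_1, namely mu P:
print(np.allclose(joint.sum(axis=0), mu @ P))
```

So $\mu P$ is literally the second marginal of $\mu\otimes P$, which is one way to see why the measure is conventionally written to the left of the kernel.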