When working with Markov chains and transition matrices $P$, we multiply from the left, meaning that, for example, $\mu^{(n)} = \mu^{(0)}P^n$, or that the stationary distribution satisfies $\pi = \pi P$. In particular, for the stationary distribution this means that $\pi$ is a left eigenvector of the transition matrix $P$.
Why do we do this left multiplication? Is it just a convention or are there any other reasons why this is done? I couldn't think of an intuitive explanation.
In my opinion, it seems more intuitive to write $\mu^{(n)} = P'^n\mu^{(0)}$ and treat $\pi$ as a (normal/right) eigenvector, $\pi = P'\pi$. It seems to me that, in this case, $P' = P^T$, the transpose of $P$.
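A quick numerical sketch of the equivalence between the two conventions, using a hypothetical $2\times 2$ transition matrix (the specific entries are made up for illustration): iterating $\mu \mapsto \mu P$ converges to $\pi$, which is simultaneously a left eigenvector of $P$ and a right eigenvector of $P^T$, both with eigenvalue $1$.

```python
import numpy as np

# Hypothetical 2-state transition matrix (rows sum to 1).
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

mu = np.array([1.0, 0.0])      # initial distribution mu^(0), as a row vector
for _ in range(200):
    mu = mu @ P                # left multiplication: mu^(n) = mu^(0) P^n

pi = np.array([2/3, 1/3])      # stationary distribution: solves pi = pi P

print(np.allclose(mu, pi))     # iterates converge to pi
print(np.allclose(pi @ P, pi))     # pi is a left eigenvector of P ...
print(np.allclose(P.T @ pi, pi))   # ... equivalently a right eigenvector of P^T
```

So the choice between $\pi = \pi P$ and $\pi = P^T\pi$ really is just a transpose; the row-vector convention keeps the transition matrix itself untransposed.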
It is indeed quite confusing, also taking into account that we often write $$ \int P(x,A)\mu(\mathrm dx) $$ even though some may prefer $\int \mu(\mathrm dx)P(x,A)$; in the latter case I am often confused about where the integral ends. Nevertheless, it seems that the notation for measures and kernels comes from the fact that functions over finite spaces are usually treated as column vectors, and measures as row vectors. In that case you of course need to write $\mu P$ for the action of $P$ on measures and $Pf$ for its action on functions. Also, $\mu f = \int f\,\mathrm d\mu$.
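Over a finite state space this row/column convention can be checked directly. A minimal sketch (the measure $\mu$, function $f$, and kernel $P$ below are hypothetical example values): $\mu P$ is again a measure, $Pf$ is again a function, $\mu f$ is the expectation $\int f\,\mathrm d\mu$, and associativity gives $(\mu P)f = \mu(Pf)$ with no transposes anywhere.

```python
import numpy as np

# Finite state space {0, 1, 2}; hypothetical values throughout.
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.0, 0.4, 0.6]])   # kernel: row x is the measure P(x, .)
mu = np.array([0.2, 0.5, 0.3])   # a measure, treated as a row vector
f  = np.array([1.0, 2.0, 3.0])   # a function, treated as a column vector

muP = mu @ P        # action of P on measures: another row vector
Pf  = P @ f         # action of P on functions: (Pf)(x) = sum_y P(x,y) f(y)

print(mu @ f)                                  # mu f = integral of f d(mu)
print(np.isclose((mu @ P) @ f, mu @ (P @ f)))  # (mu P) f = mu (P f)
```

The convention earns its keep in that last line: with measures on the left and functions on the right, the parentheses never matter.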
The fact that functions and measures are represented as column and row vectors over finite spaces perhaps has something to do with the co- and contravariance of such representations for finite Markov chains, but it can easily be just a tradition, as Tunococ has mentioned.
One more point: there is another product of measures and kernels, $\mu\otimes P$, which in your case is the joint distribution of the first two coordinates of a Markov chain. I certainly find it convenient to write $\mu$ to the left of $P$, since we also write $x_0, x_1, \dots$ (and not the reverse) for the coordinates of a Markov chain. Perhaps this was a reason for introducing $\mu P$ with the same order of arguments: note that $\mu P(\cdot) = (\mu \otimes P)(X\times \cdot)$. Yet again, $$ (\mu\otimes P)(A\times B) = \int_A P(x,B)\mu(\mathrm dx), $$ where $\mu$ is on the right in the RHS, so unless you are used to writing integrals as $\int \mu(\mathrm dx)P(x,A)$, you may find some of the notation in measure theory and probability not entirely consistent.
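In the finite case the product $\mu\otimes P$ is concrete: $(\mu\otimes P)(x,y) = \mu(x)P(x,y)$, i.e. the matrix $\operatorname{diag}(\mu)\,P$. A small sketch with made-up values, checking that marginalizing out the first coordinate recovers $\mu P$, i.e. $\mu P(\cdot) = (\mu\otimes P)(X\times\cdot)$:

```python
import numpy as np

# Hypothetical finite chain: mu is the law of X_0, P the transition kernel.
P = np.array([[0.7, 0.3],
              [0.4, 0.6]])
mu = np.array([0.5, 0.5])

# (mu ⊗ P)(x, y) = mu(x) P(x, y): the joint law of (X_0, X_1).
joint = np.diag(mu) @ P          # equivalently mu[:, None] * P

# Summing over the first coordinate gives the law of X_1, namely mu P:
print(np.allclose(joint.sum(axis=0), mu @ P))
```

So $\mu P$ is literally the second marginal of $\mu\otimes P$, which is one way to see why the measure is conventionally written to the left of the kernel.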