Why do Markov chains converge exponentially quick?

414 Views Asked by At

Let $G$ be a finite group, $q$ a probability mass function (pmf) on $G$ and $u$ the uniform pmf on $G$. We use the convolution product $$ q^{(k)}(\sigma) = \sum_{\alpha \beta =\sigma} q^{(k-1)}(\alpha)q(\beta) $$ and the $L^1$ distance between $q^k$ and $u$ $$ d(k) := || q^{(k)}-u || = \sum_{\sigma \in G} |q^{(k)}(\sigma) - u(\sigma)|. $$ Thus (see note below) $q^{(k)}$ is the distribution of the Markov chain on $G$ associated to $q$ at the $kth$ step and $d(k)$ is its rate of convergence.

Question: It is stated, as an observation here (Diaconis and Saloff-Coste, 1995, page 3), that $d$ is submultiplicative. This means that $d(n+m) \le d(n)d(m)$. Why is it true?

The observation is quite interesting: it implies an at least exponential rate of convergence (we can assume $q^{(k)}$ converges to $u$).


Notes.

  1. The Markov chain on $G$ associated to $q$ is defined as follows. Given a current state $x_n\in G$, the transition probability to $x_{n+1}$ is $q(x_n^{-1}x_{n+1})$. It can be checked that $q^{(k)}$ is the distribution of this chain at the $k$th step when starting at the identity.

  2. The problem could be formulated for general Markov chains, but I wanted to stick with the notations of the technical report.

1

There are 1 best solutions below

1
On BEST ANSWER

It suffices to show that for every distributions $p$ and $q$ on $G$, $$\sum_s\left|(p\ast q)(s)-u(s)\right|\leqslant\sum_s|p(s)-u(s)|\cdot\sum_s|q(s)-u(s)|$$ Assume that $|G|=n$ and recall that $u$ denotes the uniform distribution on $G$ hence $u(s)=n^{-1}$ for every $s$ in $G$. Furthermore, $$(p\ast q)(s)-n^{-1}=\left(\sum_tp(t)q(st^{-1})\right)-n^{-1}=\sum_t(p(t)-n^{-1})(q(st^{-1})-n^{-1})$$ where the second identity uses the fact that, for every $s$, $$\sum_tn^{-1}=\sum_tp(t)=\sum_tq(st^{-1})=1$$ Thus, the triangular inequality yields $$\sum_s\left|(p\ast q)(s)-n^{-1}\right|\leqslant\sum_s\sum_t|p(t)-n^{-1}|\cdot|q(st^{-1})-n^{-1}|$$ The double sum on the RHS is $$\sum_t|p(t)-n^{-1}|\cdot\sum_s|q(st^{-1})-n^{-1}|$$ and the proof is complete since, for every $t$, the map $s\mapsto st^{-1}$ is a bijection of $G$, hence $$\sum_s|q(st^{-1})-n^{-1}|=\sum_s|q(s)-n^{-1}|$$