Heuristic for rate of convergence of Markov chains

142 Views Asked by At

May I know why the spectral gap $:= 1 - \lambda_2$ is a heuristic for knowing the rate of convergence of Markov chains to a stationary distribution? ($1=\lambda_1 > \lambda_2>\cdots>\lambda_{|\chi|}$, where $\chi$ is the state space, are the eigenvalues of the transition matrix.) Wanted to get an intuition for why that would be the case.

1

There are 1 best solutions below

1
On BEST ANSWER

You should really be looking at the norms of the eigenvalues here. They can be negative or even complex.

Let’s assume for simplicity that the transition matrix is diagonalizable. Any state vector $\mathbf\pi$ can then be expressed as a linear combination $c_1\mathbf v_1+c_2\mathbf v_2+\cdots+c_{\lvert\chi\rvert}\mathbf v _{\lvert\chi\rvert} $ of eigenvectors. We then have $$\mathbf\pi P^n = \lambda_1^nc_1\mathbf v_1+\lambda_2^nc_2\mathbf v_2+\cdots+\lambda_{\lvert\chi\rvert}^n c_{\lvert\chi\rvert}\mathbf v _{\lvert\chi\rvert}.$$ If $\lvert\lambda_k\rvert\lt1$, then the corresponding term tends to zero as $n\to\infty$. The term with the largest eigenvalue with norm less than $1$ will vanish “more slowly” than the others, which makes it a good rough indicator of how quickly the process converges.

If there are complex eigenvalues with norm less than unity, they come in conjugate pairs and represent a scaled rotation in some plane. The scale factor is equal to the norm of the eigenvalue, so these scaled rotations also tend to zero in the long run.