Metropolis and limiting distribution


This question is about the advantages of ensuring a limiting distribution in the Metropolis algorithm.

According to Wikipedia,

uniqueness of stationary distribution: the stationary distribution $\pi(x)$ must be unique. This is guaranteed by ergodicity of the Markov process, which requires that every state must (1) be aperiodic—the system does not return to the same state at fixed intervals; and (2) be positive recurrent—the expected number of steps for returning to the same state is finite.

The article states that ergodicity implies aperiodicity. There seems to be no agreement about this, as shown in this SE question.

According to this answer, a unique stationary distribution exists if all states of an irreducible Markov chain are positive recurrent. The same reference also says that in that situation the limiting probabilities may nevertheless fail to converge.

I think the requirement of aperiodicity is imposed to ensure that the chain is ergodic according to this definition, which guarantees a limiting distribution.


Question: In applications of the Metropolis algorithm one takes the average value of some property over the states visited in a realization of the Markov chain. Why would convergence to a limiting distribution be an advantage? It seems to me that ensuring the chain is irreducible and positive recurrent should be enough.
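To make the averaging concrete, here is a minimal sketch of a Metropolis chain whose time average of a property approaches the stationary expectation. The unnormalized target `w`, the nearest-neighbour proposal on a small cyclic state space, and the property $f(x) = x$ are all illustrative choices of mine, not taken from the question.

```python
import numpy as np

# Toy Metropolis chain on states 0..4 targeting pi(x) ∝ w[x].
# Target weights and proposal are illustrative assumptions.
rng = np.random.default_rng(0)
w = np.array([1.0, 2.0, 4.0, 2.0, 1.0])   # unnormalized target
n_states = len(w)

def metropolis_step(x):
    # Propose a neighbour on the cycle (symmetric proposal),
    # accept with probability min(1, w[y]/w[x]).
    y = (x + rng.choice([-1, 1])) % n_states
    return y if rng.random() < min(1.0, w[y] / w[x]) else x

T = 200_000
x = 0
f_sum = 0.0                   # running sum of the "property" f(x) = x
counts = np.zeros(n_states)   # occupation counts per state
for _ in range(T):
    x = metropolis_step(x)
    f_sum += x
    counts[x] += 1

pi = w / w.sum()
print("time-average of f :", f_sum / T)
print("exact E_pi[f]     :", (pi * np.arange(n_states)).sum())
```

The point of the question is visible here: the quantity actually used is the time average `f_sum / T`, whose convergence only needs the ergodic-theorem-style limits below, not convergence of the marginal distribution of `x` at a fixed time.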

Any clarification of incorrect statements in this question is very welcome.


BEST ANSWER

Suppose $\{Z(t)\}_{t=0}^{\infty}$ is a discrete time Markov Chain (DTMC) with a finite or countably infinite state space $S$ and with transition probability matrix $P=(P_{ij})$.

We say that $Z(t)$ is irreducible if there is a finite path of nonzero probability from every state $i \in S$ to every other state $j \in S$.

A probability mass function (PMF) on the state space $S$ is a vector $(\pi_i)_{i \in S}$ such that $\pi_i \geq 0$ for all $i \in S$ and $\sum_{i\in S} \pi_i=1$.

Theorem: Suppose $\{Z(t)\}_{t=0}^{\infty}$ is an irreducible DTMC. If we can find a PMF $(\pi_i)_{i\in S}$ that satisfies the following stationary equations: $$ \pi_j = \sum_{i \in S} \pi_i P_{ij} \quad \forall j \in S $$ then $(\pi_i)_{i\in S}$ is the unique PMF that solves the above stationary equations, $\pi_i>0$ for all $i \in S$, and regardless of the initial condition $Z(0)$ we have for all $i \in S$: \begin{align} \lim_{T\rightarrow\infty} \frac{1}{T}\sum_{t=0}^{T-1} 1_{\{Z(t)=i\}} &= \pi_i \quad \mbox{(with prob 1)} \\ \lim_{T\rightarrow\infty} \frac{1}{T}\sum_{t=0}^{T-1}P[Z(t)=i] &= \pi_i \end{align} If there is no PMF that satisfies the stationary equations, then for all states $i \in S$ and regardless of the initial state $Z(0)$ we have: \begin{align} \lim_{T\rightarrow\infty} \frac{1}{T}\sum_{t=0}^{T-1} 1_{\{Z(t)=i\}} &= 0 \quad \mbox{(with prob 1)} \\ \lim_{T\rightarrow\infty} \frac{1}{T}\sum_{t=0}^{T-1}P[Z(t)=i] &= 0 \end{align}
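A quick numerical illustration of the theorem's first limit, using an arbitrary irreducible $3\times 3$ example of my own (not from the answer): the occupation frequencies of a simulated chain approach the stationary PMF regardless of $Z(0)$.

```python
import numpy as np

# Empirical check: for an irreducible DTMC with a stationary PMF,
# occupation frequencies converge to pi regardless of Z(0).
rng = np.random.default_rng(1)
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])   # arbitrary irreducible example

# Solve the stationary equations pi = pi P, sum(pi) = 1,
# via the left eigenvector of P for eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()

T = 100_000
z = 2                        # initial state Z(0); any choice works
freq = np.zeros(3)
for _ in range(T):
    z = rng.choice(3, p=P[z])
    freq[z] += 1
print("empirical frequencies:", freq / T)
print("stationary pi        :", pi)
```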

SECOND ANSWER

I would answer your question as follows. For simplicity, let us consider only the finite-state case.

1) If a Markov chain consists of one recurrent class and this class is periodic with period $d$, then the states of the class are partitioned into $d$ cyclically ordered subsets, say $S_0$, $S_1$, ..., $S_{d-1}$, where each subset has transitions only into the next one: from $S_0$ to $S_1$, from $S_1$ to $S_2$, ..., and from $S_{d-1}$ back to $S_0$. Let the transition probability matrix of this chain be $\mathbb{P}$.

2) The equation ${\vec x} = {\vec x} \mathbb{P}$ always has a probability vector solution. Under the conditions of 1), this vector, say $\vec p$, is unique.

3) Consider an aperiodic Markov chain with a single recurrent class (i.e., an ergodic chain) and transition probability matrix $\mathbb{P}^*$. Then $\lim_{n \rightarrow \infty} [\mathbb{P}^*]^n = {\vec 1} {\vec {p^*}}$, where ${\vec {p^*}}$ is the solution of ${\vec {p^*}} = {\vec {p^*}} \mathbb{P}^*$ and $\vec 1$ is the column vector of ones. Thus the limit of $[\mathbb{P}^*]^n$ is a matrix whose rows are all equal to ${\vec {p^*}}$, i.e. the rank-one product ${\vec 1} {\vec {p^*}}$.
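A small numerical check of point 3), using an arbitrary $2\times 2$ aperiodic chain of my own choosing:

```python
import numpy as np

# For an aperiodic, single-recurrent-class chain, P^n converges to a
# rank-one matrix whose rows all equal the stationary vector p*.
P = np.array([[0.2, 0.8],
              [0.6, 0.4]])   # arbitrary aperiodic example

# Stationary vector: left eigenvector of P for eigenvalue 1, normalized.
vals, vecs = np.linalg.eig(P.T)
p_star = np.real(vecs[:, np.argmax(np.real(vals))])
p_star = p_star / p_star.sum()

Pn = np.linalg.matrix_power(P, 50)   # high power of P
print(Pn)       # both rows approach p_star
print(p_star)
```

For this matrix the second eigenvalue is $-0.4$, so the rows of $\mathbb{P}^n$ approach ${\vec {p^*}} = (3/7, 4/7)$ geometrically fast.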

4) In the case of 1), $\lim_{n \rightarrow \infty} \mathbb{P}^n$ does not exist. For example, $(\mathbb{P}^d)_{ij}=0$ for all $j$ not in the same periodic subset as $i$. In well-behaved ergodic chains, which are the ones usually discussed, the memory of the starting state $i$ disappears as $n$ grows. For periodic chains this is too much to hope for.

5) Even though $\lim_{n \rightarrow \infty} \mathbb{P}^n$ does not exist, by 2) the vector $\vec p$ exists and is unique. It can still be called the steady-state probability vector, but its interpretation differs from the one used for ergodic Markov chains: for periodic chains the components of $\vec p$ are average values over one period. Precisely, it holds that $$ \lim_{n \rightarrow \infty} {1 \over d} \left (\mathbb{P}^n + \mathbb{P}^{n+1} + \dots + \mathbb{P}^{n+d-1} \right ) ={\vec 1}{\vec p}. $$
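Point 5) can be verified exactly with the simplest periodic chain, the deterministic two-state swap (period $d=2$), an example of my own choosing:

```python
import numpy as np

# Period-2 example: the chain alternates deterministically between its
# two states, so P^n never converges (it flips between I and P), yet the
# stationary vector p = (1/2, 1/2) exists and the average of the powers
# over one period equals the rank-one matrix 1 p exactly.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])
p = np.array([0.5, 0.5])              # unique solution of p = p P

P10 = np.linalg.matrix_power(P, 10)   # even power: identity matrix
P11 = np.linalg.matrix_power(P, 11)   # odd power: P again
avg = (P10 + P11) / 2                 # average over one period d = 2
print(avg)                            # every row equals p
```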