Let
- $(E,\mathcal E,\mu)$ be a measure space
- $I$ be a set with $|I|\in\mathbb N$ and $\zeta$ denote the counting measure on $(I,2^I)$
- $w_t:E\to[0,1]$ be $\mathcal E$-measurable for $t\in I$ with $$\sum_{t\in I}w_t=1\tag1$$
- $\pi_t:E\to[0,\infty)$ be $\mathcal E$-measurable with $$\int\pi_t\:{\rm d}\mu=1\tag2$$ for $t\in I$
- $Q_1$ be a Markov kernel on $(I,2^I)$ with $$Q_1(t,\;\cdot\;)=q_1(t,\;\cdot\;)\zeta\;\;\;\text{for all }t\in I\tag3$$ for some $q_1:I\times I\to[0,\infty)$
- $Q_2$ be a Markov kernel on $(E,\mathcal E)$ with $$Q_2(x,\;\cdot\;)=q_2(x,\;\cdot\;)\mu\;\;\;\text{for all }x\in E\tag4$$ for some $\mathcal E\otimes\mathcal E$-measurable $q_2:E\times E\to[0,\infty)$
Now, let $$\alpha_1(t,t',x):=1\wedge\frac{\pi(t',x)q_1(t',t)}{\pi(t,x)q_1(t,t')}\;\;\;\text{for }t,t'\in I\text{ and }x\in E$$ and $$\alpha_2(x,x',t'):=1\wedge\frac{\pi(t',x')q_2(x',x)}{\pi(t',x)q_2(x,x')}\;\;\;\text{for }x,x'\in E\text{ and }t'\in I$$ where $$\pi(t,x):=w_t(x)\pi_t(x)\;\;\;\text{for }(t,x)\in I\times E.$$
The simulated tempering algorithm is a variant of the Metropolis-Hastings algorithm with the following update scheme. Given the current state $(t,x)\in I\times E$:
- Let $t'\sim Q_1(t,\;\cdot\;)$ and $u\sim\mathcal U_{[0,\:1]}$
- If $u>\alpha_1(t,t',x)$, then reassign $t'=t$
- Let $x'\sim Q_2(x,\;\cdot\;)$ and $v\sim\mathcal U_{[0,\:1]}$
- If $v>\alpha_2(x,x',t')$, then reassign $x'=x$
- Return $(t',x')$
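
The update scheme above can be sketched in Python. This is only a minimal illustration under assumptions that are not part of the setup: $I=\{0,1,2\}$, $E=\mathbb R$ with Lebesgue measure, $\pi_t$ the $N(0,\sigma_t^2)$ density, constant weights $w_t=1/3$, a uniform $q_1$ and a Gaussian random-walk $q_2$. All names (`tempering_step`, `SCALES`, and so on) are hypothetical.

```python
import math
import random

SCALES = [1.0, 2.0, 4.0]  # assumption: pi_t is the N(0, SCALES[t]^2) density


def normal_pdf(z, s):
    return math.exp(-0.5 * (z / s) ** 2) / (s * math.sqrt(2.0 * math.pi))


def pi(t, x):
    # pi(t, x) = w_t(x) * pi_t(x), with the assumed constant weight w_t = 1/|I|
    return normal_pdf(x, SCALES[t]) / len(SCALES)


def q1(t, t_new):
    return 1.0 / len(SCALES)  # assumed uniform proposal density on I


def sample_q1(t, rng):
    return rng.randrange(len(SCALES))


def q2(x, x_new):
    return normal_pdf(x_new - x, 1.0)  # assumed Gaussian random walk on E


def sample_q2(x, rng):
    return rng.gauss(x, 1.0)


def tempering_step(t, x, pi, q1, q2, sample_q1, sample_q2, rng):
    """One update (t, x) -> (t', x') with two separate accept/reject decisions."""
    # Stage 1: propose t' ~ Q1(t, .) and keep it with probability alpha_1(t, t', x).
    t_new = sample_q1(t, rng)
    a1 = min(1.0, pi(t_new, x) * q1(t_new, t) / (pi(t, x) * q1(t, t_new)))
    if rng.random() > a1:
        t_new = t
    # Stage 2: propose x' ~ Q2(x, .) and keep it with probability
    # alpha_2(x, x', t'), where t' is the index left by stage 1.
    x_new = sample_q2(x, rng)
    a2 = min(1.0, pi(t_new, x_new) * q2(x_new, x) / (pi(t_new, x) * q2(x, x_new)))
    if rng.random() > a2:
        x_new = x
    return t_new, x_new


rng = random.Random(0)
state = (0, 0.0)
for _ in range(1000):
    state = tempering_step(*state, pi, q1, q2, sample_q1, sample_q2, rng)
print(state)
```

The sketch assumes $\pi(t,x)>0$ at the current state, so that both ratios are well defined.
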
The generated chain has a stationary distribution with density $\pi$ with respect to $\zeta\otimes\mu$.
Question: We could define a Markov kernel \begin{equation}\begin{split}Q\left(\left(t,x\right),C\right)&:=\int Q_1(t,{\rm d}t')\int Q_2(x,{\rm d}x')1_C(t',x')\\&=\int 1_C(t',x')\underbrace{q_1(t,t')q_2(x,x')}_{=:\:q\left(\left(t,\:x\right),\:\left(t',\:x'\right)\right)}\left(\zeta\otimes\mu\right)\left({\rm d}\left(t',x'\right)\right)\end{split}\end{equation} on $(I\times E,2^I\otimes\mathcal E)$ and an acceptance probability $$\alpha\left(\left(t,x\right),\left(t',x'\right)\right):=1\wedge\frac{\pi(t',x')q\left(\left(t',x'\right),\left(t,x\right)\right)}{\pi(t,x)q\left(\left(t,x\right),\left(t',x'\right)\right)}.$$ How does the algorithm described above differ from the usual Metropolis-Hastings algorithm with proposal kernel $Q$ and acceptance probability $\alpha$?
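
For concreteness, one step of the Metropolis-Hastings algorithm with the product proposal $Q$ and acceptance probability $\alpha$ could look like the following Python sketch. It rests on illustrative assumptions that are not part of the question: $I=\{0,1,2\}$, $E=\mathbb R$, Gaussian $\pi_t$, constant $w_t=1/3$, uniform $q_1$ and a Gaussian random-walk $q_2$. All names are hypothetical. In this form $t'$ and $x'$ are proposed together and accepted or rejected as a pair with a single uniform draw.

```python
import math
import random

SCALES = [1.0, 2.0, 4.0]  # assumption: pi_t is the N(0, SCALES[t]^2) density


def normal_pdf(z, s):
    return math.exp(-0.5 * (z / s) ** 2) / (s * math.sqrt(2.0 * math.pi))


def pi(t, x):
    return normal_pdf(x, SCALES[t]) / len(SCALES)  # assumed w_t = 1/|I|


def q1(t, t_new):
    return 1.0 / len(SCALES)  # assumed uniform proposal density on I


def q2(x, x_new):
    return normal_pdf(x_new - x, 1.0)  # assumed Gaussian random walk on E


def joint_mh_step(t, x, pi, q1, q2, sample_q1, sample_q2, rng):
    """One Metropolis-Hastings step with proposal density q = q1 * q2."""
    # Draw (t', x') ~ Q((t, x), .): both components are proposed at once.
    t_new = sample_q1(t, rng)
    x_new = sample_q2(x, rng)
    num = pi(t_new, x_new) * q1(t_new, t) * q2(x_new, x)
    den = pi(t, x) * q1(t, t_new) * q2(x, x_new)
    # Single accept/reject decision for the whole pair.
    if rng.random() > min(1.0, num / den):
        return t, x
    return t_new, x_new


rng = random.Random(0)
state = (0, 0.0)
for _ in range(1000):
    state = joint_mh_step(
        *state, pi, q1, q2,
        lambda t, r: r.randrange(len(SCALES)),
        lambda x, r: r.gauss(x, 1.0),
        rng,
    )
print(state)
```
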