Let
- $(\Omega,\mathcal A,\operatorname P)$ be a probability space
- $(E,\mathcal E)$ be a measurable space
- $Q$ be a Markov kernel with source and target $(E,\mathcal E)$
- $x_0\in E$
- $\alpha:E\times E\to[0,1]$
- $\delta_x$ denote the Dirac measure at $x\in E$ on $(E,\mathcal E)$
The Metropolis-Hastings algorithm works as follows:
- Given $n\in\mathbb N_0$, let $y_{n+1}$ be a sample from $Q(x_n,\;\cdot\;)$
- With probability $\alpha(x_n,y_{n+1})$, set $$x_{n+1}:=y_{n+1};$$ otherwise set $$x_{n+1}:=x_n$$
At the end of the algorithm, we have obtained a sample path $(x_n)_{n\in\mathbb N_0}$ of a Markov chain $(X_n)_{n\in\mathbb N_0}$. Let $$R(x):=1-\int α(x,y)\:Q(x,{\rm d}y)\;\;\;\text{for }x∈E$$ and $$κ(x,B):=\int_Bα(x,y)\:Q(x,{\rm d}y)+R(x)δ_x(B)\;\;\;\text{for }(x,B)∈E×\mathcal E.$$ Then, it's easy to see that $$\operatorname P\left[X_{n+1}\in B\mid X_n=x\right]=\kappa(x,B)\;\;\;\text{for all }(x,B)\in E\times\mathcal E\tag1$$ for all $n\in\mathbb N$.
How can we show that we even got $$\operatorname P\left[X_{n+1}\in B\mid\mathcal F_n\right]=\kappa(X_n,B)\;\;\;\text{almost surely for all }B\in\mathcal E,\tag2$$ where $\mathcal F_n:=\sigma(X_0,\ldots,X_n)$, for all $n\in\mathbb N_0$?
I was actually able to show that $$\operatorname P\left[X_{n+1}\in B\mid X_n\right]=\kappa(X_n,B)\;\;\;\text{almost surely for all }B\in\mathcal E\tag3$$ for all $n\in\mathbb N_0$. Intuitively, it's clear to me that $\operatorname P\left[X_{n+1}\in B\mid\mathcal F_n\right]=\operatorname P\left[X_{n+1}\in B\mid X_n\right]$, but how do we need to argue rigorously?