I'm currently working through the proof of the Metropolis-Hastings algorithm, and using two sources:
I understand most of the proof up to the point where we prove that $\pi(y)$ is the invariant distribution of $P(x, dy)$ (the proof is at the bottom of this page).
The two parts I am unsure of are:
1) They seem to simply drop the delta function $\delta_{x}(A)$ (which represents the point mass at $x$) on the second line of the proof below. I know that $\delta$ equals 1 if $x \in dy$ and 0 otherwise, but why does it simply disappear? (The use of $dy$ without an integral also confuses me: the definition uses $\delta_{x}(dy)$, but the proof below uses a set $A$ of possible values, $\delta_{x}(A)$, which makes more sense to me than $dy$.)
2) I know that an invariant distribution satisfies $\pi(y)dy = \int_{A} P(x, dy)\pi(x)dx$; however, the proof seems to show $\int_{A} \pi(y)dy = \int P(x, dy)\pi(x)dx$.
I would be really grateful if someone could point me to some more information (ideally something I can cite), or explain one or both of these two points to me. Thanks.
\begin{align*}
\int P(x, A)\pi(x)dx &= \int \left[\int_{A}p(x, y)dy\right]\pi(x)dx + \int_{A}r(x)\delta_{x}(A)\pi(x)dx\\
&= \int_{A} \left[ \int p(x, y)\pi(x)dx\right]dy + \int_{A}r(x)\pi(x)dx\\
&= \int_{A} \left[ \int p(y, x)\pi(y)dx\right]dy + \int_{A}r(x)\pi(x)dx\\
&\qquad \text{(we know that } r(y) = 1 - \int p(y, x)dx \text{, so } \int p(y, x)dx = 1 - r(y)\text{)}\\
&= \int_{A}(1 - r(y))\pi(y)dy + \int_{A}r(x)\pi(x)dx\\
&= \int_{A}\pi(y)dy - \int_{A}r(y)\pi(y)dy + \int_{A}r(x)\pi(x)dx\\
&= \int_{A}\pi(y)dy
\end{align*}
Why it is written as $\delta_x(dy)$ is not important. I think it is a serious abuse of notation, if not just plain wrong. It is a rather bizarre convention; I have always seen the indicator function used instead.
You do not need to worry about this technicality. As for your question: $\delta_x(A)$ equals 1 when $x \in A$ and 0 otherwise. Since you are integrating over $A$, $\delta_x(A) = 1$ everywhere on the region of integration, so you can drop it.
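To spell the step out, here is a short sketch using the indicator function $\mathbf{1}_{A}(x)$ (notation introduced here for illustration; it equals $\delta_{x}(A)$ by the definition of the point mass) and $A^{c}$ for the complement of $A$:

\begin{align*}
\int r(x)\delta_{x}(A)\pi(x)dx &= \int r(x)\mathbf{1}_{A}(x)\pi(x)dx\\
&= \int_{A} r(x)\cdot 1\cdot\pi(x)dx + \int_{A^{c}} r(x)\cdot 0\cdot\pi(x)dx\\
&= \int_{A} r(x)\pi(x)dx
\end{align*}

If the integral is already restricted to $A$, only the first piece is present and the conclusion is the same.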
For 2, how can the invariant distribution satisfy $\pi(y)dy = \int_{A} P(x, dy)\pi(x)dx$ when the right-hand side depends on $A$, whereas the left-hand side does not? There are some measure-theoretic restrictions on what kind of sets $A$ can be.
The invariant distribution should satisfy the relationship you have given a proof for.
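One way to reconcile the two forms, as a sketch assuming $\pi$ has a density and reading $\pi(dy)$ as shorthand for $\pi(y)dy$: the differential statement has no set on either side, and integrating it over a measurable set $A$ gives the statement that the proof establishes.

\begin{align*}
\pi(dy) &= \int P(x, dy)\pi(x)dx\\
\Longrightarrow\quad \int_{A}\pi(y)dy &= \int P(x, A)\pi(x)dx \quad \text{for every measurable } A
\end{align*}

Here the integral over $x$ runs over the whole state space, and $A$ enters only as the second argument of $P$.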
What you have proven is (1) in the Chib and Greenberg paper, in the form $\pi(A) = \int_{\mathbb{R}^d} P(x,A)\pi(x)dx$ for any reasonable choice of $A$, i.e. measurable with respect to an appropriate measure. This means (1) holds for almost every $x$, but not necessarily every $x$. There are conditions you can place on these chains to make it hold for every $x$, but I have not personally seen any proofs of this.
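For a concrete sanity check, the identity can be tested on a finite state space, where integrals become sums. The following is only an illustrative sketch, not the construction from the paper: the four-state target `pi`, the proposal matrix `q`, and the helper names `mh_parts` and `kernel` are all made up for this example. It builds $p(x, y)$ and $r(x)$ as in the proof and checks both the point-mass step and $\sum_x P(x, A)\pi(x) = \pi(A)$ exactly with rational arithmetic.

```python
from fractions import Fraction


def mh_parts(pi, q):
    """Split a Metropolis-Hastings kernel into its move part p and stay mass r.

    p[x][y] = q[x][y] * alpha(x, y) for y != x, and r[x] = 1 - sum_y p[x][y].
    """
    n = len(pi)
    p = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x != y and q[x][y] > 0:
                ratio = (pi[y] * q[y][x]) / (pi[x] * q[x][y])
                p[x][y] = q[x][y] * min(Fraction(1), ratio)
    r = [1 - sum(row) for row in p]
    return p, r


def kernel(p, r, x, A):
    """P(x, A) = sum_{y in A} p(x, y) + r(x) * delta_x(A)."""
    delta = 1 if x in A else 0
    return sum(p[x][y] for y in A) + r[x] * delta


# Hypothetical target and non-symmetric proposal on four states (assumptions).
pi = [Fraction(k, 10) for k in (1, 2, 3, 4)]
q = [
    [Fraction(0), Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
    [Fraction(1, 3), Fraction(0), Fraction(1, 3), Fraction(1, 3)],
    [Fraction(1, 4), Fraction(1, 4), Fraction(0), Fraction(1, 2)],
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4), Fraction(0)],
]
p, r = mh_parts(pi, q)
states = range(len(pi))
A = {1, 3}

# Point-mass term: summing over all x with delta_x(A) equals summing over A.
delta_all = sum(r[x] * (1 if x in A else 0) * pi[x] for x in states)
delta_on_A = sum(r[x] * pi[x] for x in A)

# Invariance: sum_x P(x, A) pi(x) should equal pi(A).
lhs = sum(kernel(p, r, x, A) * pi[x] for x in states)
rhs = sum(pi[y] for y in A)
print(delta_all == delta_on_A, lhs == rhs)
```

Using `Fraction` keeps the comparison exact, so the equalities are not blurred by floating-point rounding. Changing `A` to any other subset of the states should leave both checks true, which is the finite-state analogue of the statement holding for every measurable $A$.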