What do "$dx$" and "$dy$" mean in a Markov kernel?


I am learning about Markov chains on a general state space $\mathcal{X}$ and do not understand what exactly is meant by the $dy$ part of a Markov kernel $P(x, dy)$.

I came across a definition of a reversible Markov chain: a Markov chain on $\mathcal{X}$ is reversible with respect to a probability distribution $\pi(\cdot)$ on $\mathcal{X}$ if \begin{align*} \pi(dx) P(x, dy) = \pi(dy) P(y, dx)\end{align*} for all $x,y \in \mathcal{X}$.

According to the definition, a Markov kernel is a map $P: \mathcal{X} \times \mathcal{B}(\mathcal{X}) \to [0,1]$, so I expect the second argument of $P$ to be a set. I am not sure what is rigorously meant by $dx$ when it appears outside of an integral.

For reference, my confusion stems from page 4 of https://arxiv.org/pdf/math/0404033.pdf.


There are 3 best solutions below

3
On BEST ANSWER

The Markov kernel $P$ provides numbers $P(y,A)$ for each $y \in X$ and each $A \in \mathcal B(X)$. For a fixed $y$, the map $A \mapsto P(y,A)$ is a measure. And the notation $P(y,dx)$ is used to indicate integration with respect to that measure.

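So I would interpret $\pi(dx) P(x, dy) = \pi(dy) P(y, dx)$ to mean: $$ \int \left(\int f(x,y) \;P(x, dy)\right) \pi(dx) = \int\left(\int f(x,y) \;P(y, dx)\right) \pi(dy) $$ for all bounded measurable functions $f : X \times X \to \mathbb R$. The kernel has to sit in the inner integral, because the measure $P(x, dy)$ still depends on $x$, which is only integrated out afterwards by $\pi(dx)$.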

0
On

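$\pi(dx) P(x, dy)$ can be interpreted as the probability measure on $\mathcal{X} \times \mathcal{X}$ determined, for measurable sets $A$ and $B$, by $$ \mu(A,B) := \int\int 1_A(x) 1_B(y) \pi(dx)P(x, dy) \overset?= \int\int 1_A(x) 1_B(y) \pi(dy)P(y, dx) = \mu(B, A) $$ So I think reversibility tells you that this induced measure on the product space is symmetric. (It is a joint measure rather than a product measure in general, since $P(x, dy)$ depends on $x$.) At least that is what it looks like to me.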
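On a finite state space the integrals become sums, which makes the symmetry easy to check by hand. The sketch below is only an illustration under assumptions that are not in the question: the three-state chain, the edge weights `w`, and the helper names `mu` and `subsets` are all made up. It uses a random walk on a weighted graph, which is a standard example of a reversible chain.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical 3-state example: random walk on a weighted graph.
# Symmetric edge weights w[x][y] == w[y][x] give a reversible chain.
w = [[0, 1, 2],
     [1, 0, 3],
     [2, 3, 0]]
n = len(w)
deg = [sum(row) for row in w]
total = sum(deg)

# Stationary distribution pi({x}) and kernel P(x, {y}), in exact arithmetic.
pi = [Fraction(deg[x], total) for x in range(n)]
P = [[Fraction(w[x][y], deg[x]) for y in range(n)] for x in range(n)]


def mu(A, B):
    """Discrete analogue of the double integral of 1_A(x) 1_B(y) pi(dx) P(x, dy)."""
    return sum(pi[x] * P[x][y] for x in A for y in B)


def subsets(states):
    """All subsets of a finite collection of states."""
    states = list(states)
    return [set(c) for r in range(len(states) + 1)
            for c in combinations(states, r)]


all_sets = subsets(range(n))
# Reversibility: mu(A, B) == mu(B, A) for every pair of sets.
symmetric = all(mu(A, B) == mu(B, A) for A in all_sets for B in all_sets)
print(symmetric)
```

Here $\pi(\{x\}) P(x, \{y\})$ equals `w[x][y] / total`, so the symmetry of $\mu$ is inherited directly from the symmetry of the weights.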

0
On

For a measure $\mu$, the notation $\mu(\mathrm{d}x)$ refers to nothing but the measure $\mu$ itself, with emphasis on the intuitive description that $\mu$ puts the weight $\mu(\Delta x)$ on any sufficiently small set $\Delta x$ about $x$.

Likewise, the expression $ \pi(\mathrm{d}x) P(x, \mathrm{d}y) $ refers to a measure on $\mathcal{X}\times\mathcal{X}$ that puts the weight $\pi(\Delta x)P(x, \Delta y) $ on any sufficiently small patch $\Delta x \times \Delta y$ about the point $(x, y)$.
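To see what this means in a familiar special case, assume (this is not part of the question) that $\mathcal{X} = \mathbb{R}^d$ and that both measures have densities with respect to Lebesgue measure, say $\pi(\mathrm{d}x) = \rho(x)\,\mathrm{d}x$ and $P(x, \mathrm{d}y) = p(x, y)\,\mathrm{d}y$, where $\rho$ and $p$ are symbols introduced here only for illustration. Then the weight on a small patch is approximately $\rho(x)\,p(x, y)\,|\Delta x|\,|\Delta y|$, and the reversibility condition reduces to the usual detailed balance equation for densities:

$$ \pi(\mathrm{d}x) P(x, \mathrm{d}y) = \pi(\mathrm{d}y) P(y, \mathrm{d}x) \quad\Longleftrightarrow\quad \rho(x)\, p(x, y) = \rho(y)\, p(y, x) \ \text{ for almost every } (x, y). $$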

This formulation can be made more rigorous by introducing the mechanism of "feeling the weights from place to place", just like a blind man trying to fathom the shape of an elephant by touching it here and there. Mathematically, this is done by integrating $\mu$ against test functions. In this regard, we have the following basic result in measure theory:

Theorem. Let $\mu$ and $\nu$ be finite measures on a measurable space $(X, \mathcal{F})$. Then the following are equivalent:

  1. $\mu = \nu$. ($\mu$ and $\nu$ are equal.)
  2. $\int_X f(x) \, \mu(\mathrm{d}x) = \int_X f(x) \, \nu(\mathrm{d}x)$ for any bounded measurable function $f : X \to \mathbb{R}$. (Whenever you try to feel both of the measures in the same way, they feel just the same.)

So, by integrating both sides of $\pi(\mathrm{d}x) P(x, \mathrm{d}y) = \pi(\mathrm{d}y) P(y, \mathrm{d}x)$ against a bounded measurable test function $f(x, y)$, we obtain an equivalent formulation of the equality as:

$$ \int \, \pi(\mathrm{d}x) P(x, \mathrm{d}y) \, f(x, y) = \int \pi(\mathrm{d}y) P(y, \mathrm{d}x) \, f(x, y) $$
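The test-function formulation can be tried out numerically on a finite state space, where both integrals are finite double sums. The following is a minimal sketch under assumptions of my own: the four-point target `pi`, the Metropolis construction with a uniform proposal, the test function `f`, and the names `metropolis_kernel`, `lhs`, and `rhs` are all hypothetical choices for illustration.

```python
from fractions import Fraction

# Hypothetical target distribution on the states {0, 1, 2, 3}.
pi = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
n = len(pi)


def metropolis_kernel(pi):
    """Metropolis kernel with a uniform proposal over the other states."""
    n = len(pi)
    q = Fraction(1, n - 1)  # proposal probability for each other state
    P = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x != y:
                # Propose y, accept with probability min(1, pi(y) / pi(x)).
                P[x][y] = q * min(Fraction(1), pi[y] / pi[x])
        P[x][x] = 1 - sum(P[x])  # rejected proposals stay at x
    return P


P = metropolis_kernel(pi)


def lhs(f):
    """Discrete analogue of the integral of f(x, y) against pi(dx) P(x, dy)."""
    return sum(pi[x] * P[x][y] * f(x, y) for x in range(n) for y in range(n))


def rhs(f):
    """Discrete analogue of the integral of f(x, y) against pi(dy) P(y, dx)."""
    return sum(pi[y] * P[y][x] * f(x, y) for x in range(n) for y in range(n))


def f(x, y):
    """An arbitrary, deliberately asymmetric test function."""
    return x * x - 3 * y + x * y


print(lhs(f) == rhs(f))
```

The two sides agree for every `f` because $\pi(\{x\}) P(x, \{y\}) = q \min(\pi(\{x\}), \pi(\{y\}))$ is symmetric in $x$ and $y$, which is the discrete form of the displayed identity.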