Interpretation of conditional probability given $X_0 = x$ for a Markov process and its semigroup $(P_t)$


When studying Markov processes, I have seen many authors define the semigroup as $P_tf(x) = \mathbb E_x(f(X_t))$ (with the assumption that $X_t$ is homogeneous) and call $\mathbb E_x$ the "expectation given $X_0=x$", i.e., they mean

$$\mathbb E_x(\cdot) = \mathbb E[\cdot|X_0=x],$$

and I couldn't find a rigorous definition of this, because if $X_0$ is an absolutely continuous random variable then the right-hand side doesn't work in the usual sense (dividing by $\mathbb P(X_0=x)$, which is zero). However, I also noticed that some authors avoid defining this conditional law by starting out with "Markov kernels" associated with a Markov process $(X_t)$, which totally makes sense to me. I'm okay with the latter approach, although there are things I don't fully understand yet; I will reserve those for another post.
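To make the kernel viewpoint concrete, here is a minimal numerical sketch (my own illustration, using the Gaussian transition kernel of Brownian motion as the example) of why $P_tf(x)$ is a plain number for each fixed $x$, with no conditioning on the null event $\{X_0 = x\}$ needed:

```python
import math
import random

random.seed(1)

# Illustration (not from any particular text): for Brownian motion the
# transition kernel of P_t is Gaussian, so
#     P_t f(x) = E_x[f(X_t)] = E[f(x + sqrt(t) * Z)],   Z ~ N(0, 1),
# a deterministic value for each fixed x.

def P_t(f, t, x, n=200_000):
    """Monte Carlo estimate of P_t f(x) for the Brownian semigroup."""
    return sum(f(x + math.sqrt(t) * random.gauss(0.0, 1.0)) for _ in range(n)) / n

# For f(y) = y^2 the closed form is P_t f(x) = x^2 + t.
approx = P_t(lambda y: y * y, t=0.5, x=1.0)  # should be close to 1.5
```

Here the kernel, not a conditional probability given a null event, is what defines $\mathbb E_x$.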

In addition, some even define $X=(X_t)_{t\geq 0}$ to be homogeneous if and only if, for every Borel measurable set $\Gamma$ (in the metric space where $X_t$ takes values), we have

$$ \mathbb P(X_t \in \Gamma | X_s) = \mathbb P (X_{t+u} \in \Gamma|X_{s+u}), \quad \forall u >0. $$

First question: Absent any further qualification, should I interpret this equality as holding almost surely, with the left-hand side being $\sigma(X_{s+u})$-measurable and a version of $\mathbb P (X_{t+u} \in \Gamma|X_{s+u})$?

Second question: I'm looking for a rigorous definition of $\mathbb E[\cdot|X_0=x]$ above; I believe it should take a deterministic value for $P_tf(x)$ to make sense.

Any rigorous reference related to this is highly appreciated. Thank you for your help!

1 Answer

The conditional probabilities can be defined as conditional expectations of indicator functions: assuming $\Gamma$ is a Borel measurable subset of $\mathbb{R}$ and $X_t, X_{t+u}, X_s, X_{s+u}$ are random variables, we define \begin{align} P[X_t \in \Gamma|X_s] &= E[1_{\{X_t\in \Gamma\}}|X_s]\\ P[X_{t+u}\in \Gamma|X_{s+u}] &= E[1_{\{X_{t+u}\in \Gamma\}}|X_{s+u}] \end{align}

Then, you are correct that the following equality does not make sense: $$ P[X_t \in \Gamma | X_s] = P[X_{t+u} \in \Gamma|X_{s+u}] \quad \forall u \geq 0 \quad (*) $$ That is because $P[X_t \in \Gamma | X_s]$ is a function of random variable $X_s$, while $P[X_{t+u}\in \Gamma|X_{s+u}]$ is a function of random variable $X_{s+u}$. If $u>0$ then, quite likely, these have different values and it does not make sense to claim any almost-sure equality.

A corrected statement of (*) is this: For any Borel measurable function $g:\mathbb{R}\rightarrow\mathbb{R}$ such that $g(X_s)$ is a version of $P[X_t\in \Gamma|X_s]$, we have for every $u\geq 0$ that $g(X_{s+u})$ is a version of $P[X_{t+u}\in \Gamma|X_{s+u}]$.


In general, if $Y$ is a random variable and $W$ is a random variable with $E[W^2]<\infty$, we can interpret $E[W|Y=y]$ as follows: choose any Borel measurable $g:\mathbb{R}\rightarrow\mathbb{R}$ for which $g(Y)$ is a version of $E[W|Y]$ and then define $$ E[W|Y=y] := g(y) \quad \forall y \in \mathbb{R}$$

This definition is not unique, because we might have chosen some other Borel measurable function $\tilde{g}:\mathbb{R}\rightarrow\mathbb{R}$ such that $\tilde{g}(Y)$ is a version of $E[W|Y]$. We are not guaranteed that $g(y)=\tilde{g}(y)$ for all $y \in \mathbb{R}$. However, comparing these two versions, we are guaranteed that $$ P[g(Y)=\tilde{g}(Y)]=1$$

In other words, if we define $$ A = \{y \in \mathbb{R}:g(y)\neq \tilde{g}(y)\}$$ then $A$ is a Borel measurable subset of $\mathbb{R}$ and $\mu_{F_Y}(A)=0$, where $\mu_{F_Y}$ is the measure induced by the CDF of $Y$: $$ \mu_{F_Y}(D)=P[Y\in D]\quad \forall D \in \mathcal{B}(\mathbb{R})$$
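A tiny simulation (my own toy example) of this non-uniqueness: take $Y$ uniform on $\{0,1\}$ and $W=Y$, so that $E[W|Y]=Y$, and compare two versions $g$ and $\tilde g$ of $E[W|Y=y]$ that disagree only on a $\mu_{F_Y}$-null set:

```python
import random

random.seed(0)

# Y is uniform on {0, 1} and W = Y, so E[W | Y] = Y.
# Both g and g_tilde below are versions of E[W | Y = y]: they agree on
# the support {0, 1} of Y and disagree only at y = 0.5, a set A with
# mu_{F_Y}(A) = P[Y in A] = 0.

def g(y):
    return float(y)                           # g(y) = y everywhere

def g_tilde(y):
    return 42.0 if y == 0.5 else float(y)     # differs only off the support

samples = [random.choice([0, 1]) for _ in range(10_000)]

# g(Y) = g_tilde(Y) almost surely: the bad set {0.5} is never sampled.
mismatches = sum(g(y) != g_tilde(y) for y in samples)
```

So $g$ and $\tilde g$ genuinely differ as functions on $\mathbb{R}$, yet $P[g(Y)=\tilde g(Y)]=1$, exactly as in the statement above.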