Motivation for Regular Conditional Distribution in Klenke's Book


I am reading Section 8.3, Regular Conditional Distribution, of Klenke's Probability Theory. The author motivates the topic by saying that we would like to define for every $x\in E$ a probability measure $\mathbf{P}[\cdot|X=x]$ such that for any $A\in\mathcal{A}$, we have $\mathbf{P}[A|X]=\mathbf{P}[A|X=x]$ on $\{X=x\}$. To this end, the author defines the conditional expectation of $Y\in\mathcal{L}^1(\mathbf{P})$ given $X=x$ by $\mathbf{E}[Y|X=x]:=\varphi(x)$, where $\varphi$ satisfies $\mathbf{E}[Y|X]=\varphi(X)$. He also defines $\mathbf{P}[A|X=x]=\mathbf{E}[\mathbf{1}_{A}|X=x]$. He then goes on to state the following:

[...] for every given $A\in\mathcal{A}$, the expression $\mathbf{P}[A|X=x]$ is defined for almost all $x$ only; that is, up to $x$ in a null set that may, however, depend on $A$.

My question: I am having trouble understanding the meaning of this sentence. It seems to me that the author is picking a version $Z$ of $\mathbf{E}[Y|X]$, and then finding $\varphi$ such that $Z=\varphi(X)$ everywhere. But then the expression $\mathbf{P}[A|X=x]=\varphi(x)$ is defined for all $x$ in the range of $X$. What does the author mean by "for almost all $x$"? Is he speaking of the push-forward measure $\mathbf{P}\circ X^{-1}$ on $E$?

What I do understand is the following: The conditional probability $\mathbf{P}[A|X]$ is a random variable defined for almost all $\omega\in\Omega$ only, that is, up to $\omega$ in a null set that may depend on $A$. However, I am not sure how this is connected to the author's statement.

1 Answer

As you wrote, (a version of) the conditional probability of $A ∈ \mathcal A$ given $X = x$ is defined as $ \mathbb P [A | X =x] = φ(x)$, where $φ: E → \mathbb R$ is a measurable map factorizing a version $Z$ of $ \mathbb P[A|σ(X)]$, i.e. $$Z(ω) = φ(X(ω)) \quad ∀ ω ∈ Ω.\tag{$*$}$$

So indeed we first pick a version of $ \mathbb P[A|σ(X)]$ and then factorize it. There are two sources of indeterminacy in this construction (which is why we speak of versions):

  1. The factorizing map $φ$ is determined uniquely by eq. $(*)$ on $X(Ω)$, but is arbitrary (as long as it is measurable) on $X(Ω)^c$, i.e. any measurable $\tilde φ: E → \mathbb R$ agreeing with $φ$ on $X(Ω)$ also works as a version of $\mathbb P [A | X =x]$. If $X(Ω)$ is measurable in $(E,\mathcal E)$, then $\mathbb P_X[X(Ω)] = \mathbb P[X ∈ X(Ω)]=1$, so for a fixed version $Z$ of $ \mathbb P[A|σ(X)]$, the function $\mathbb P [A | X =x]$ is determined uniquely up to the $\mathbb P_X$-negligible set $X(Ω)^c$.
  2. Two versions of $\mathbb P[A|σ(X)]$ can disagree on a $\mathbb P$-negligible set. Suppose $Z$ is our version of $\mathbb P[A|σ(X)]$ from above, $N = X^{-1}(\tilde N) ∈ σ(X)$ is negligible and $Q$ is $σ(X)$-measurable. Then $\tilde Z = Z I_{N^c} + Q I_N $ is also a version of $\mathbb P[A|σ(X)]$. Now if $\tilde φ$ factorizes $\tilde Z$, then for every $x ∈ \tilde N ∩ X(Ω)$ there is an $ω_x ∈ \{ X= x\} ⊂ N $ and we have $$ \tilde φ(x) = \tilde φ(X(ω_x)) = \tilde Z(ω_x) = Q(ω_x).$$ So on top of the indeterminacy on $X(Ω)^c$, $\tilde φ$ can differ from $φ$ on $ \tilde N ∩ X(Ω)$. However, since we assumed $N = X^{-1}(\tilde N ) $ to be negligible, we have $\mathbb P_X[\tilde N] = \mathbb P[N] = 0$, so this disagreement too is limited to a $\mathbb P_X$-negligible set.
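
A concrete case may help to see why the negligible set depends on $A$. The setup here is an illustrative assumption and is not taken from Klenke's text: take $Ω = E = [0,1]$ with the Borel $σ$-algebra, let $\mathbb P$ be the Lebesgue measure and $X(ω) = ω$, so that $\mathbb P_X$ is again the Lebesgue measure. For fixed $t ∈ [0,1]$ put $A_t = \{X = t\}$. Since $A_t ∈ σ(X)$ and $\mathbb P[A_t] = 0$, both $I_{A_t}$ and the constant $0$ are versions of $\mathbb P[A_t|σ(X)]$, and factorizing them gives two versions of $\mathbb P[A_t | X = x]$:

$$φ_t(x) = I_{\{t\}}(x), \qquad \tilde φ_t(x) = 0, \qquad N_t := \{x ∈ E : φ_t(x) ≠ \tilde φ_t(x)\} = \{t\}.$$

Each $N_t$ is $\mathbb P_X$-negligible, but it moves with $A_t$, and

$$\bigcup_{t ∈ [0,1]} N_t = [0,1], \qquad \mathbb P_X\Big[\bigcup_{t ∈ [0,1]} N_t\Big] = 1.$$

So in this example there is no single $\mathbb P_X$-negligible set outside of which all choices of versions agree for every $A ∈ \mathcal A$ simultaneously. As far as I can tell, this is what the quoted sentence is pointing at: the versions have to be chosen consistently across all $A$ (here, for instance, $\mathbb P[A|X=x] = I_A(x)$) in order to obtain a probability measure in $A$ for each $x$.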