I've seen several questions posted here, but they were not precisely what I'm looking for, either in the answer given or in the question itself.
1.
If $P_{x}=P_{\delta_{x}}$, then why is $P_{\mu}(A)=\int P_{x}(A)\, \mu(dx)$?
I'm looking at this as $$\begin{split} P_{\mu}\Bigl(\bigcap_{i=0}^{n} \{ X_i \in B_i\}\Bigr) & =\int \mu(dx) \int_{B_0} \delta_x(dx_0) \int_{B_1} p(x_0,dx_1) \cdots \int_{B_n} p(x_{n-1},dx_n) \\ & = \int \mu(dx) \, \mathbf{1}_{B_0}(x) \int_{B_1} p(x,dx_1) \cdots \int_{B_n} p(x_{n-1},dx_n) \\ & = \int_{B_0} \mu(dx) \int_{B_1} p(x,dx_1) \cdots \int_{B_n} p(x_{n-1},dx_n) \end{split}$$
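As a sanity check (not from the book), the cylinder-set identity above can be verified exactly on a finite state space, where all the integrals become sums. The two-state transition matrix, the initial distribution, and the sets $B_i$ below are all hypothetical illustration values.

```python
# Sanity check of  P_mu(X0 in B0, X1 in B1, X2 in B2) = int P_x(...) mu(dx)
# on a two-state chain. All numeric values are hypothetical.

P = [[0.7, 0.3],          # hypothetical transition matrix p(x, y)
     [0.4, 0.6]]
mu = [0.25, 0.75]         # hypothetical initial distribution
B0, B1, B2 = {0, 1}, {1}, {0}   # cylinder sets B_i

# Left side: P_mu of the cylinder set, computed directly by summing
# mu(x0) p(x0, x1) p(x1, x2) over admissible paths.
lhs = sum(mu[x0] * P[x0][x1] * P[x1][x2]
          for x0 in B0 for x1 in B1 for x2 in B2)

# Right side: the mixture  int P_x(...) mu(dx),  where P_x starts the
# chain at x, i.e. delta_x forces X0 = x (so delta_x(B0) = 1_{B0}(x)).
def P_x(x, B0, B1, B2):
    if x not in B0:
        return 0.0
    return sum(P[x][x1] * P[x1][x2] for x1 in B1 for x2 in B2)

rhs = sum(mu[x] * P_x(x, B0, B1, B2) for x in (0, 1))

assert abs(lhs - rhs) < 1e-12
```

The `if x not in B0` branch is exactly the step $\int_{B_0}\delta_x(dx_0) = \mathbf{1}_{B_0}(x)$ from the computation above.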
2.
Let $Y:\Omega\rightarrow\mathbb{R}$ be measurable and bounded. Then $$E_{\mu}(Y\circ\theta_{n} \mid \mathcal{F}_{n})=E_{X_{n}}(Y).$$
According to a remark in the book (page 283), $E_{X_{n}}(Y)=g(X_n)$. The way I'm interpreting this is $Y\circ\theta_{n}=Y(X_{n+1},X_{n+2},\ldots)$ and $E_{\mu}(Y(X_{n+1},X_{n+2},\ldots) \mid \mathcal{F}_{n})=g(X_n)$. However, I think I'm losing information; I believe I should interpret it as $$E_{\mu}(Y(X_{n+1},X_{n+2},\ldots) \mid \mathcal{F}_{n})=E(Y(X_{n+1},X_{n+2},\ldots) \mid X_{n}).$$ How can I justify that?
3.
On Wikipedia, the Markov property is stated in terms of conditional probabilities. I see how Durrett's definition yields the conditional-probability formulation by taking indicator functions, but how do we go in the other direction? Or are the definitions not equivalent?
Answer to Q1: Given the family $(P_x)_{x \in \mathbb{R}}$ of probability measures, we would like to find a probability measure $P_{\mu}$ such that $X_0$ has initial distribution $\mu$ with respect to $P_{\mu}$ and such that the Markov property $$E_{\mu}(Y \circ \theta_n \mid \mathcal{F}_n) = E_{X_n}(Y) \tag{1}$$ holds $P_{\mu}$-almost surely.
The natural candidate for this is
$$P_{\mu}(A) := \int P_x(A) \, \mu(dx).$$
Indeed: Since $P_{x}(X_0 \in \cdot) = \delta_x(\cdot)$ we have
$$P_{\mu}(X_0 \in A) = \int \underbrace{P_x(X_0 \in A)}_{\delta_x(A)} \, \mu(dx) = \int 1_A(x) \, \mu(dx) = \mu(A),$$
i.e. $X_0$ has distribution $\mu$ with respect to $P_{\mu}$. To prove $(1)$ we note that
$$E_x(Y \circ \theta_n \mid \mathcal{F}_n) = E_{X_n}(Y) \quad \text{$P_x$-a.s.}$$
gives
$$\int_F Y \circ \theta_n \, dP_x = \int_F E_{X_n}(Y) \, dP_x$$
for any $F \in \mathcal{F}_n$ and $x \in \mathbb{R}$. Integrating both sides with respect to $\mu(dx)$ we get
$$\int \left( \int_F Y \circ \theta_n \, dP_x \right) \mu(dx) = \int \left( \int_F E_{X_n}(Y) \, dP_x \right) \mu(dx),$$
i.e., by the definition of $P_{\mu}$ (and Fubini's theorem),
$$\int_F Y \circ \theta_n \, dP_{\mu} = \int_F E_{X_n}(Y) \, dP_{\mu}$$
for all $F \in \mathcal{F}_n$. Since $E_{X_n}(Y) = g(X_n)$ is $\mathcal{F}_n$-measurable, this implies $(1)$.
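The defining identity $\int_F Y \circ \theta_n \, dP_{\mu} = \int_F E_{X_n}(Y) \, dP_{\mu}$ can also be checked exactly on a finite chain, for a $Y$ that depends on only two coordinates and a concrete $F \in \mathcal{F}_1$. Everything numeric below (transition matrix, initial law, choice of $Y$ and $F$) is a hypothetical illustration, with $n = 1$.

```python
# Exact check of  int_F (Y o theta_1) dP_mu = int_F E_{X_1}(Y) dP_mu
# on a two-state chain. All numeric values are hypothetical.

P = [[0.7, 0.3],          # hypothetical transition matrix p(x, y)
     [0.4, 0.6]]
mu = [0.25, 0.75]         # hypothetical initial distribution
states = (0, 1)

# Y depends on the first two coordinates: Y(w) = f(w_0, w_1).
def f(a, b):
    return 1.0 if b == 1 else 0.0   # Y = 1_{X_1 = 1} under P_x

# g(x) = E_x(Y) = sum_y f(x, y) p(x, y)
def g(x):
    return sum(f(x, y) * P[x][y] for y in states)

# F in F_1 = sigma(X_0, X_1): here F = {X_1 = 1}.
def in_F(x0, x1):
    return x1 == 1

# Left side: E_mu[ 1_F * (Y o theta_1) ], where (Y o theta_1)(w) = f(w_1, w_2),
# summing mu(x0) p(x0, x1) p(x1, x2) over paths in F.
lhs = sum(mu[x0] * P[x0][x1] * P[x1][x2] * f(x1, x2)
          for x0 in states for x1 in states for x2 in states
          if in_F(x0, x1))

# Right side: E_mu[ 1_F * g(X_1) ], a sum over two coordinates only.
rhs = sum(mu[x0] * P[x0][x1] * g(x1)
          for x0 in states for x1 in states
          if in_F(x0, x1))

assert abs(lhs - rhs) < 1e-12
```

The left side needs the third coordinate $x_2$ because $Y \circ \theta_1$ looks one step past $\mathcal{F}_1$; the right side collapses that step into $g(X_1)$, which is the content of the Markov property.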
Answer to Q2: The statement
$$E_{\mu}(Y(X_{n+1},\ldots) \mid \mathcal{F}_n) = g(X_n) \tag{2}$$
implies
$$E_{\mu}(Y(X_{n+1},\ldots) \mid \mathcal{F}_n) = E_{\mu}(Y(X_{n+1},\ldots) \mid X_n) \tag{3}.$$
Indeed: If we take on both sides of $(2)$ the conditional expectation with respect to $\sigma(X_n)$, the tower property (applicable since $\sigma(X_n) \subseteq \mathcal{F}_n$) together with $E_{\mu}(g(X_n) \mid X_n) = g(X_n)$ gives
$$E_{\mu}(Y(X_{n+1},\ldots) \mid X_n) = g(X_n); \tag{4}$$
combining (2) and (4) we get immediately (3):
$$E_{\mu}(Y(X_{n+1},\ldots) \mid \mathcal{F}_n) \stackrel{(2)}{=} g(X_n) \stackrel{(4)}{=}E_{\mu}(Y(X_{n+1},\ldots) \mid X_n).$$
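The tower step used above, $E_{\mu}(E_{\mu}(Z \mid \mathcal{F}_1) \mid X_1) = E_{\mu}(Z \mid X_1)$ for $\sigma(X_1) \subseteq \mathcal{F}_1$, can be verified by brute force on a finite path space, where conditional expectations are just atom averages. The chain, initial law, and the (deliberately not $\sigma(X_1)$-measurable) functional $Z$ below are hypothetical illustration values.

```python
# Finite-path-space check of the tower property used in the Q2 answer:
# E_mu( E_mu(Z | F_1) | X_1 ) = E_mu( Z | X_1 ),  since sigma(X_1) c F_1.
# All numeric values are hypothetical.

from itertools import product

P = [[0.7, 0.3],
     [0.4, 0.6]]
mu = [0.25, 0.75]
states = (0, 1)
paths = list(product(states, repeat=3))   # paths (x0, x1, x2)

# Probability of a length-3 path under P_mu.
def prob(x0, x1, x2):
    return mu[x0] * P[x0][x1] * P[x1][x2]

# A bounded functional of the path that is NOT sigma(X_1)-measurable.
def Z(x0, x1, x2):
    return 1.0 if (x0 == 0 and x2 == 1) else 0.0

# E_mu(Z | F_1) as a function of (x0, x1): average Z over the atom
# {X_0 = x0, X_1 = x1} of F_1 = sigma(X_0, X_1).
def cond_F1(x0, x1):
    num = sum(prob(*w) * Z(*w) for w in paths if w[:2] == (x0, x1))
    den = sum(prob(*w) for w in paths if w[:2] == (x0, x1))
    return num / den

# E_mu(W | X_1) as a function of x1: average a path functional W
# over the atom {X_1 = x1} of sigma(X_1).
def cond_X1(W, x1):
    num = sum(prob(*w) * W(*w) for w in paths if w[1] == x1)
    den = sum(prob(*w) for w in paths if w[1] == x1)
    return num / den

for x1 in states:
    lhs = cond_X1(lambda a, b, c: cond_F1(a, b), x1)  # E(E(Z|F_1)|X_1)
    rhs = cond_X1(Z, x1)                              # E(Z|X_1)
    assert abs(lhs - rhs) < 1e-12
```

On a finite space this equality is an exact algebraic identity (both sides reduce to the same weighted sum over paths), which is why the strict `1e-12` tolerance suffices.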
Answer to Q3: Yes, the definitions are equivalent. The identity for indicator functions extends to simple functions by linearity, and from there to all bounded measurable $Y$ by a (functional) monotone class argument.