Equivalent definitions of a Markov chain: how Definition 1 implies Definition 2?


I'm reading about two equivalent definitions of a Markov chain from this page.


As usual, our starting point is a probability space $(\Omega, \mathscr{F}, \mathbb{P})$, so $\Omega$ is the sample space, $\mathscr{F}$ the $\sigma$-algebra of events, and $\mathbb{P}$ the probability measure on $(\Omega, \mathscr{F})$. Let $\boldsymbol{X}=\left(X_0, X_1, X_2, \ldots\right)$ be a stochastic process defined on the probability space, with time space $\mathbb{N}$ and with countable state space $S$. In the context of the general introduction, $S$ is given the power set $\mathscr{P}(S)$ as the $\sigma$-algebra.

The vector space $\mathscr{B}$ consisting of bounded functions $f: S \rightarrow \mathbb{R}$ will play an important role. The norm that we use is the supremum norm defined by $$ \|f\|=\sup \{|f(x)|: x \in S\}, \quad f \in \mathscr{B}. $$

For $n \in \mathbb{N}$, let $\mathscr{F}_n=\sigma\left\{X_0, X_1, \ldots, X_n\right\}$, the $\sigma$-algebra generated by the process up to time $n$. Thus $\mathfrak{F}=\left\{\mathscr{F}_0, \mathscr{F}_1, \mathscr{F}_2, \ldots\right\}$ is the natural filtration associated with $\boldsymbol{X}$.

  • Definition 1 $\boldsymbol{X}=\left(X_0, X_1, X_2, \ldots\right)$ is a Markov chain if either of the following equivalent conditions is satisfied: a. $\mathbb{P}\left(X_{n+1}=x | \mathscr{F}_n\right)=\mathbb{P}\left(X_{n+1}=x | X_n\right)$ a.s. for every $n \in \mathbb{N}$ and $x \in S$. b. $\mathbb{E}\left[f\left(X_{n+1}\right) | \mathscr{F}_n\right]=\mathbb{E}\left[f\left(X_{n+1}\right) | X_n\right]$ a.s. for every $n \in \mathbb{N}$ and $f \in \mathscr{B}$.

  • Definition 2 $\boldsymbol{X}=\left(X_0, X_1, X_2, \ldots\right)$ is a Markov chain if for every $n \in \mathbb{N}$ and every sequence of states $\left(x, x_n, \ldots, x_0\right)$ such that $\mathbb P (X_n=x_n , \ldots, X_0=x_0) > 0$, we have $$ \begin{align*} &\mathbb{P}\left(X_{n+1}=x | X_n=x_n , \ldots, X_0=x_0\right) \\ = &\mathbb{P}\left(X_{n+1}=x | X_n=x_n\right). \end{align*} $$
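To make Definition 2 concrete, here is a small numerical sanity check (my own addition, with made-up numbers): for a chain built from an initial distribution and a transition matrix, the conditional probability given the whole history can be computed by exact enumeration of path probabilities and compared with the one-step transition probability.

```python
import itertools

# Hypothetical 2-state chain; these numbers are illustrative, not from the question.
states = [0, 1]
init = {0: 0.3, 1: 0.7}                    # distribution of X_0
P = {0: {0: 0.9, 1: 0.1},                  # transition matrix: P[i][j] = P(X_{n+1}=j | X_n=i)
     1: {0: 0.4, 1: 0.6}}

def path_prob(path):
    """P(X_0 = path[0], ..., X_k = path[k]) for the chain above."""
    p = init[path[0]]
    for a, b in zip(path, path[1:]):
        p *= P[a][b]
    return p

n = 2  # check Definition 2 at time n
for history in itertools.product(states, repeat=n + 1):   # (x_0, ..., x_n)
    ph = path_prob(list(history))
    if ph == 0:
        continue
    x_n = history[-1]
    for x in states:
        # LHS: P(X_{n+1}=x | X_n=x_n, ..., X_0=x_0), computed from path probabilities
        lhs = path_prob(list(history) + [x]) / ph
        # RHS: P(X_{n+1}=x | X_n=x_n), which here is the transition probability
        rhs = P[x_n][x]
        assert abs(lhs - rhs) < 1e-12
print("Definition 2 holds for every positive-probability history")
```

Of course this only confirms that a chain constructed from a transition matrix satisfies Definition 2; it is a sanity check, not a proof.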


In my failed attempt below, I am stuck at showing $\frac{\mathbb E [Z_2 1_{C}]}{\mathbb P(C)} = \mathbb P (B_{n+1}|B_n)$. Could you elaborate on how to finish the proof?


My attempt

  1. Definition 1 implies Definition 2.

Let $B_{n+1} = \{X_{n+1} = x\}$. Let $B_i := \{X_i=x_i\}$ for all $i=0, \ldots,n$ and $C := \bigcap_{i=0}^n B_i$. We want to prove $$ \mathbb P (B_{n+1}|C) = \mathbb P (B_{n+1}|B_n). $$

Let $$ \begin{align} Z_1 &:= \mathbb{P}\left(X_{n+1}=x | \mathscr{F}_n\right) = \mathbb E[1_{B_{n+1}}| \mathscr{F}_n], \\ Z_2 &:=\mathbb{P}\left(X_{n+1}=x | X_n\right) = \mathbb E[1_{B_{n+1}}| \sigma(X_n)]. \end{align} $$

By Definition 1, $Z_1 = Z_2$ a.s. Notice that $C\in \mathscr{F}_n$. By definition of conditional expectation, $$ \begin{align} \mathbb E [Z_1 1_{C}] &= \mathbb E [1_{B_{n+1}} 1_C] = \mathbb P(B_{n+1} \cap C) \\ &= \mathbb P (B_{n+1}|C) \mathbb P(C). \end{align} $$

It suffices to prove that $\frac{\mathbb E [Z_1 1_{C}]}{\mathbb P(C)} = \mathbb P (B_{n+1}|B_n)$, or equivalently $\frac{\mathbb E [Z_2 1_{C}]}{\mathbb P(C)} = \mathbb P (B_{n+1}|B_n)$.

Best Answer

I found an answer in this thread by @saz. For the sake of my own understanding, I re-present those ideas below.


For $A,B \in \mathscr{F}$, we define $\mathbb P (A|B) := \frac{\mathbb P(A \cap B)}{\mathbb P (B)}$ if $\mathbb P (B)>0$ and $0$ otherwise.

Lemma Let $S$ be countable, $A \in \mathscr{F}$, and $Y:\Omega \to S$ a random variable. Then $\mathbb P (A | Y) := \mathbb E[1_A | \sigma(Y)] = \psi(Y)$ a.s. where $\psi:S \to \mathbb R$ is a measurable function defined by $\psi (y) := \mathbb P (A | Y=y) := \mathbb P (A | \{Y=y\})$ for all $y \in S$.

Proof Clearly, $\psi$ is measurable, so $\psi(Y)$ is $\sigma(Y)$-measurable. It remains to prove that $\mathbb E[\psi(Y) 1_B] = \mathbb E[1_A 1_B]$ for all $B \in \sigma(Y)$. Since $S$ is countable, every $B \in \sigma(Y)$ is a countable disjoint union of atoms $\{Y=y\}$, so by countable additivity it suffices to treat $B = \{Y=y\}$ for some $y \in S$. If $\mathbb P(B)=0$, both sides vanish and we are done. Suppose $\mathbb P(B)>0$. We have $$ \mathbb E[\psi(Y) 1_B] = \mathbb E[\psi(y) 1_{\{Y=y\}}] = \psi(y) \mathbb P(Y=y)= \mathbb P (A \cap \{Y=y\}). $$ On the other hand, $\mathbb E[1_A 1_B] = \mathbb P(A\cap B) = \mathbb P (A \cap \{Y=y\})$. This completes the proof.
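The Lemma can also be checked numerically on a toy finite probability space (my addition; all numbers are illustrative): $\psi(y) = \mathbb P(A \mid Y=y)$ is computed directly, and the defining identity $\mathbb E[\psi(Y) 1_B] = \mathbb P(A \cap B)$ is verified for every $B \in \sigma(Y)$, i.e. every union of atoms $\{Y=y\}$.

```python
import itertools

# Toy finite probability space; the numbers are made up for illustration.
Omega = ['a', 'b', 'c', 'd']
prob  = {'a': 0.1, 'b': 0.2, 'c': 0.3, 'd': 0.4}
Y     = {'a': 0, 'b': 0, 'c': 1, 'd': 1}   # a random variable Omega -> S = {0, 1}
A     = {'a', 'c'}                          # an event

S = sorted(set(Y.values()))

def psi(y):
    """psi(y) = P(A | Y = y), with the convention psi(y) = 0 if P(Y=y) = 0."""
    pY = sum(prob[w] for w in Omega if Y[w] == y)
    if pY == 0:
        return 0.0
    return sum(prob[w] for w in Omega if Y[w] == y and w in A) / pY

# sigma(Y) consists exactly of the unions of atoms {Y = y}; check the defining
# identity E[psi(Y) 1_B] = P(A ∩ B) on every such B (including B = empty set).
for r in range(len(S) + 1):
    for ys in itertools.combinations(S, r):
        B = {w for w in Omega if Y[w] in ys}
        lhs = sum(prob[w] * psi(Y[w]) for w in B)
        rhs = sum(prob[w] for w in A & B)
        assert abs(lhs - rhs) < 1e-12
print("psi(Y) satisfies the defining property of E[1_A | sigma(Y)]")
```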

Fix $x \in S$ and $n \in \mathbb N$. Applying the Lemma to the random variable $(X_0, \ldots, X_n)$, which takes values in the countable set $S^{n+1}$ and generates $\mathscr{F}_n = \sigma(X_0, \ldots, X_n)$, and to $X_n$, we get $$ \begin{align} \mathbb{P} (X_{n+1}=x | \mathscr{F}_n) &= \psi(X_0, \ldots, X_n) ,\\ \mathbb{P} (X_{n+1}=x | X_n) &= \varphi(X_n), \end{align} $$ for some measurable functions $\psi:S^{n+1} \to \mathbb R$ and $\varphi:S \to \mathbb R$ such that $$ \begin{align} \psi(x_0, \ldots, x_n) &= \mathbb P(X_{n+1}=x | X_n=x_n, \ldots, X_0=x_0), \\ \varphi(x_n) &= \mathbb P(X_{n+1}=x | X_n=x_n), \end{align} $$ for all $x_0, \ldots, x_n \in S$.

  1. Definition 1 implies Definition 2.

Suppose $\mathbb P (X_n=x_n , \ldots, X_0=x_0) > 0$. By Definition 1, $\psi(X_0, \ldots, X_n) = \varphi(X_n)$ a.s. and thus $$ 1_{\{X_n=x_n , \ldots, X_0=x_0\}} \psi(X_0, \ldots, X_n) = 1_{\{X_n=x_n , \ldots, X_0=x_0\}}\varphi(X_n) \quad \text{a.s.} $$

This implies $$ \mathbb E[1_{\{X_n=x_n , \ldots, X_0=x_0\}} \psi(X_0, \ldots, X_n)] = \mathbb E[1_{\{X_n=x_n , \ldots, X_0=x_0\}}\varphi(X_n)]. $$

Since $\psi(X_0, \ldots, X_n) = \psi(x_0, \ldots, x_n)$ and $\varphi(X_n) = \varphi(x_n)$ on the event $\{X_n=x_n , \ldots, X_0=x_0\}$, this reduces to $$ \mathbb P (X_n=x_n , \ldots, X_0=x_0) \psi(x_0, \ldots, x_n) = \mathbb P (X_n=x_n , \ldots, X_0=x_0) \varphi(x_n). $$

Because $\mathbb P (X_n=x_n , \ldots, X_0=x_0)>0$, we get $\psi(x_0, \ldots, x_n) = \varphi(x_n)$.

  2. Definition 2 implies Definition 1.

We want to prove $\psi(X_0, \ldots, X_n) = \varphi(X_n)$ a.s. Since $S$ is countable, the union of the events $\{X_n=x_n , \ldots, X_0=x_0\}$ with zero probability is a null set, so it suffices to prove $\psi(x_0, \ldots, x_n) = \varphi(x_n)$ for all $x_0, \ldots, x_n \in S$ such that $\mathbb P (X_n=x_n , \ldots, X_0=x_0) > 0$. However, this trivially follows from Definition 2.
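For completeness, condition (b) of Definition 1 can also be checked by exact enumeration on a small chain (my addition; the numbers and the bounded $f$ are made up): on each positive-probability atom of $\mathscr{F}_n$, $\mathbb E[f(X_{n+1}) \mid \mathscr{F}_n]$ is the average of $f$ over the next step given the whole history, and for a chain built from a transition matrix it should coincide with $\mathbb E[f(X_{n+1}) \mid X_n]$.

```python
import itertools

# Hypothetical 2-state chain (illustrative numbers), used to check condition (b)
# of Definition 1: E[f(X_{n+1}) | F_n] = E[f(X_{n+1}) | X_n] a.s.
states = [0, 1]
init = {0: 0.5, 1: 0.5}
P = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
f = {0: 1.0, 1: -2.0}          # a bounded function f: S -> R

def path_prob(path):
    """P(X_0 = path[0], ..., X_k = path[k]) for the chain above."""
    p = init[path[0]]
    for a, b in zip(path, path[1:]):
        p *= P[a][b]
    return p

n = 2
for history in itertools.product(states, repeat=n + 1):
    ph = path_prob(list(history))
    if ph == 0:
        continue
    x_n = history[-1]
    # On the atom {X_0=x_0, ..., X_n=x_n}, E[f(X_{n+1}) | F_n] is the average of
    # f(x) weighted by P(X_{n+1}=x | whole history) ...
    lhs = sum(f[x] * path_prob(list(history) + [x]) for x in states) / ph
    # ... while E[f(X_{n+1}) | X_n] weights by P(X_{n+1}=x | X_n=x_n) only.
    rhs = sum(f[x] * P[x_n][x] for x in states)
    assert abs(lhs - rhs) < 1e-12
print("condition (b) of Definition 1 holds on every positive-probability atom")
```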