Why are these two definitions of Markov property equivalent?


Question

Suppose that $S$ is a finite or a countable subset of $\mathbb R$ and $(\xi_n)_{n\in\mathbb N}$ is an $S$-valued sequence of random variables. Then are these two definitions of Markov property equivalent?
1. For all $n\in\mathbb N$ and all $s\in S$, $P(\xi_{n+1}=s|\xi_0,\ldots,\xi_n)=P(\xi_{n+1}=s|\xi_n)$.
2. For all $n\in\mathbb N$ and for all $s_0,\ldots,s_{n+1}\in S$, $P(\xi_{n+1}=s_{n+1}|\xi_0=s_0,\ldots,\xi_n=s_n)=P(\xi_{n+1}=s_{n+1}|\xi_n=s_n)$.

Here, $P(A|\xi_0,\ldots,\xi_n):=P(A|\sigma(\xi_0,\ldots,\xi_n))=E(1_A|\sigma(\xi_0,\ldots,\xi_n))$ for all $A$ in the given $\sigma$-algebra of the probability space and all random variables $\xi_0,\ldots,\xi_n$.


How did I come to the question

I'm reading Brzezniak & Zastawniak, "Basic Stochastic Processes." The book defines a Markov chain as follows (note that (5.10) is the same as 1 in my question):

Definition Suppose that $S$ is a finite or a countable set. Suppose also that a probability space $(\Omega,\mathcal F,P)$ is given. An $S$-valued sequence of random variables $\xi_n$, $n\in\mathbb N$, is called an $S$-valued Markov chain or a Markov chain on $S$ if for all $n\in\mathbb N$ and all $s\in S$ $$P(\xi_{n+1}=s|\xi_0,\ldots,\xi_n)=P(\xi_{n+1}=s|\xi_n).\tag{5.10}$$ Here $P(\xi_{n+1}=s|\xi_n)$ is the conditional probability of the event $\{\xi_{n+1}=s\}$ with respect to the random variable $\xi_n$, or equivalently, with respect to the $\sigma$-field $\sigma(\xi_n)$ generated by $\xi_n$. Similarly, $P(\xi_{n+1}=s|\xi_0,\ldots,\xi_n)$ is the conditional probability of $\{\xi_{n+1}=s\}$ with respect to the $\sigma$-field $\sigma(\xi_0,\ldots,\xi_n)$ generated by the random variables $\xi_0,\ldots,\xi_n$.
Property (5.10) will usually be referred to as the Markov property of the Markov chain $\xi_n$, $n\in\mathbb N$. The set $S$ is called the state space and the elements of $S$ are called states.

But in the proof of the proposition that follows, it says:

...
A similar line of reasoning shows that $\xi_n$ is indeed a Markov chain. For this we need to verify that for any $n\in\mathbb N$ and any $s_0,s_1,\ldots,s_{n+1}\in S$ $$P(\xi_{n+1}=s_{n+1}|\xi_0=s_0,\ldots,\xi_n=s_n)=P(\xi_{n+1}=s_{n+1}|\xi_n=s_n).$$ ...

This asserts that 2 in my question implies 1, but gives no proof. I'm fairly sure that 1 also implies 2, because the 'definition' given at https://en.wikipedia.org/wiki/Markov_property contains a formulation that is the same as 2.


My attempt

I really don't see a way to start.
For 1 $\to$ 2, I put $\zeta_0=P(\xi_{n+1}=s_{n+1}|\xi_0,\ldots,\xi_n)$ and $\zeta_1=P(\xi_{n+1}=s_{n+1}|\xi_n)$. I also put $A=\{\xi_{0}=s_0\}\cap\cdots\cap\{\xi_n=s_n\}$ and $B=\{\xi_n=s_n\}$. I found that $\int_A\zeta_0\,dP=\int_A\zeta_1\,dP=P(\xi_0=s_0,\ldots,\xi_{n+1}=s_{n+1})$ and $\int_B\zeta_0\,dP=\int_B\zeta_1\,dP=P(\xi_n=s_n,\xi_{n+1}=s_{n+1})$. But I don't know how to continue, or whether this even helps.

Answer

No, as you stated them the definitions are not equivalent. The reason is that null sets can cause problems. For instance, if $\xi_n = \sum_{j=1}^n X_j$ is a simple random walk, then $(\xi_n)_{n \in \mathbb{N}}$ satisfies the first definition but not the second one; just note that $$\mathbb{P}(\xi_2 = 2 \mid \xi_0 = 5, \xi_1 = 1) = 0 \neq \mathbb{P}(\xi_2 = 2 \mid \xi_1=1)>0,$$ where the conditioning event $\{\xi_0 = 5, \xi_1 = 1\}$ is null because $\xi_0 = 0$.
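The counterexample can be checked by exact enumeration. The sketch below (my own illustration, assuming fair $\pm 1$ steps) enumerates all step sequences $(X_1, X_2)$ and computes both conditional probabilities; the conditioning event of definition 2 is null, so the classical conditional probability is $0$ by convention, while the right-hand side is $1/2$:

```python
from itertools import product
from fractions import Fraction

# Simple random walk: xi_0 = 0, xi_n = X_1 + ... + X_n with fair +/-1 steps.
# Enumerate the four equally likely step sequences (X_1, X_2).
paths = list(product([-1, 1], repeat=2))

def prob(event):
    """Probability of an event over the finite sample space of step sequences."""
    hits = [p for p in paths if event(p)]
    return Fraction(len(hits), len(paths))

# P(xi_0 = 5) = 0: xi_0 is identically 0, so the conditioning event
# {xi_0 = 5, xi_1 = 1} in definition 2 is null, and the classical
# conditional probability is 0 by convention.
p_null = prob(lambda p: 0 == 5)
assert p_null == 0

# P(xi_2 = 2 | xi_1 = 1) = P(X_2 = 1) = 1/2 > 0.
p_joint = prob(lambda p: p[0] == 1 and p[0] + p[1] == 2)
p_cond = p_joint / prob(lambda p: p[0] == 1)
print(p_cond)  # 1/2
```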

It is, however, possible to show the following statement:

Theorem Let $S$ be a countable set and $(\xi_n)_{n \in \mathbb{N}}$ a sequence of $S$-valued random variables. Then the following statements are equivalent:

  1. $\mathbb{P}(\xi_{n+1} = s \mid \xi_0,\ldots,\xi_n) = \mathbb{P}(\xi_{n+1}= s \mid \xi_n)$ for all $n \in \mathbb{N}$ and $s \in S$,
  2. If $n \in \mathbb{N}$ and $s_0,\ldots,s_n \in S$ are such that $$\mathbb{P}(\xi_0=s_0,\ldots,\xi_n=s_n)>0$$ then $$\mathbb{P}(\xi_{n+1} = s \mid \xi_0 = s_0,\ldots,\xi_n = s_n) = \mathbb{P}(\xi_{n+1} = s \mid \xi_n = s_n) \quad \text{for all $s \in S$.}$$

For the proof we will use the following auxiliary statement:

Lemma Let $U$ be a countable set and $Y$ a $U$-valued random variable. Then $$\mathbb{P}(A \mid Y) = g(Y) \quad \text{a.s.} \tag{1}$$ for $$g(y) := \mathbb{P}(A \mid Y=y), \qquad y \in U. \tag{2}$$

In $(2)$, $\mathbb{P}(A \mid Y=y)$ denotes the classical conditional probability, i.e. $$\mathbb{P}(A \mid Y=y) = \begin{cases} \frac{\mathbb{P}(A \cap \{Y=y\})}{\mathbb{P}(Y=y)}, & \mathbb{P}(Y=y)>0, \\ 0, & \text{otherwise} \end{cases}. $$
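The case distinction in this classical conditional probability translates directly into code. Here is a minimal sketch on a finite sample space of equally likely outcomes (the function name and die example are my own, not from the answer):

```python
from fractions import Fraction

def classical_cond_prob(omega, A, B):
    """P(A | B) on a finite sample space `omega` of equally likely outcomes,
    with the convention P(A | B) = 0 when P(B) = 0."""
    pB = Fraction(len(B & omega), len(omega))
    if pB == 0:
        return Fraction(0)  # the null-set convention used in the lemma
    pAB = Fraction(len(A & B & omega), len(omega))
    return pAB / pB

omega = {1, 2, 3, 4, 5, 6}        # one fair die roll
even = {2, 4, 6}
print(classical_cond_prob(omega, even, {1, 2, 3}))  # 1/3
print(classical_cond_prob(omega, even, set()))      # 0 (null conditioning event)
```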

Proof of the lemma: The random variable $g(Y)$ is clearly $\sigma(Y)$-measurable; to prove $(1)$ it therefore remains to show that $$\int_F 1_A \, d\mathbb{P} = \int_F g(Y) \, d\mathbb{P} \tag{3}$$ for all $F \in \sigma(Y)$. Since the $\sigma$-algebra $\sigma(Y)$ is generated by sets of the form $\{Y=u\}$, $u \in U$, it suffices to check $(3)$ for $F=\{Y=u\}$ with $u \in U$ fixed. We consider two cases separately:

  • If $u$ is such that $\mathbb{P}(Y=u)=0$, i.e. $\mathbb{P}(F)=0$, then clearly both sides of $(3)$ equal $0$.
  • If $u$ is such that $\mathbb{P}(Y=u)>0$, then $$\int_F g(Y) \, d\mathbb{P} = \int_{\{Y=u\}} \underbrace{g(Y)}_{g(u)} \, d\mathbb{P} = g(u) \mathbb{P}(Y=u).$$ By the definition of $g$, we have $g(u) = \mathbb{P}(A \mid Y=u) = \mathbb{P}(A \cap \{Y=u\})/\mathbb{P}(Y=u)$, and so $$\int_F g(Y) \, d\mathbb{P} = \mathbb{P}(A \cap \{Y=u\}) = \int_{\{Y=u\}} 1_A \, d\mathbb{P} = \int_F 1_A \, d\mathbb{P}.$$

Proof of the theorem: Fix $s \in S$ and $n \in \mathbb{N}$. By the above lemma, we have

$$\mathbb{P}(\xi_{n+1} = s \mid \xi_0,\ldots,\xi_n) = g(\xi_0,\ldots,\xi_n) \quad \text{and} \quad \mathbb{P}(\xi_{n+1} = s \mid \xi_n) = h(\xi_n) \tag{4}$$ where $$\begin{align*} g(y_0,\ldots,y_n) &:= \mathbb{P}(\xi_{n+1} = s \mid \xi_0=y_0,\ldots,\xi_n=y_n)\\ h(y_n) &:= \mathbb{P}(\xi_{n+1} = s \mid \xi_n = y_n) . \end{align*}$$

$1. \implies 2.$: Let $s_0,\ldots,s_n$ be such that $F:=\{\xi_0=s_0,\ldots,\xi_n = s_n\}$ satisfies $\mathbb{P}(F)>0$. By assumption and $(4)$, we have $$g(\xi_0,\ldots,\xi_n) = \mathbb{P}(\xi_{n+1} =s \mid \xi_0,\ldots,\xi_n) = \mathbb{P}(\xi_{n+1} = s \mid \xi_n) = h(\xi_n) \quad \text{a.s.}$$ In particular, $$1_{\{\xi_0=s_0,\ldots,\xi_n = s_n\}} g(\xi_0,\ldots,\xi_n) = 1_{\{\xi_0=s_0,\ldots,\xi_n = s_n\}} h(\xi_n),$$ i.e. $$1_{\{\xi_0=s_0,\ldots,\xi_n = s_n\}} g(s_0,\ldots,s_n) = 1_{\{\xi_0=s_0,\ldots,\xi_n = s_n\}} h(s_n).$$ Since $\mathbb{P}(\xi_0=s_0,\ldots,\xi_n=s_n)>0$, this is equivalent to saying that $$g(s_0,\ldots,s_n) = h(s_n);$$ by the very definition of $g$ and $h$ this yields $$\mathbb{P}(\xi_{n+1} = s \mid \xi_0=s_0,\ldots,\xi_n=s_n) = \mathbb{P}(\xi_{n+1} = s \mid \xi_n = s_n).$$

$2. \implies 1.$: If 2. holds then it follows from the very definition of $g$ and $h$ that $$g(y_0,\ldots,y_n) = h(y_n)$$ for all $y_0,\ldots,y_n \in S$ such that $\mathbb{P}(\xi_0=y_0,\ldots,\xi_n=y_n)>0$. Hence, $$g(\xi_0,\ldots,\xi_n) = h(\xi_n) \quad \text{a.s.}$$ which implies by $(4)$ that $$\mathbb{P}(\xi_{n+1} = s \mid \xi_0,\ldots,\xi_n) = \mathbb{P}(\xi_{n+1}= s \mid \xi_n) \quad \text{a.s.}$$
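The content of definition 2 can be verified exactly on a small example. The sketch below (a two-state chain on $S=\{0,1\}$ with an initial distribution and transition matrix chosen arbitrarily for illustration) enumerates all positive-probability paths and checks that both sides of definition 2 agree:

```python
from itertools import product
from fractions import Fraction

# A two-state chain on S = {0, 1}; the initial distribution and transition
# matrix are arbitrary choices for illustration.
init = {0: Fraction(1, 2), 1: Fraction(1, 2)}
trans = {0: {0: Fraction(1, 3), 1: Fraction(2, 3)},
         1: {0: Fraction(3, 4), 1: Fraction(1, 4)}}

def path_prob(path):
    """P(xi_0 = path[0], ..., xi_k = path[k]) for this chain."""
    p = init[path[0]]
    for a, b in zip(path, path[1:]):
        p *= trans[a][b]
    return p

n = 2  # check definition 2 for xi_3 given xi_0, xi_1, xi_2
for s_path in product([0, 1], repeat=n + 1):
    pF = path_prob(s_path)
    if pF == 0:
        continue  # definition 2 only constrains positive-probability paths
    for s in [0, 1]:
        # Left side: P(xi_{n+1} = s | xi_0 = s_0, ..., xi_n = s_n).
        lhs = path_prob(s_path + (s,)) / pF
        # Right side: P(xi_{n+1} = s | xi_n = s_n), computed from scratch
        # by summing over all paths ending in s_n.
        p_sn = sum(path_prob(q) for q in product([0, 1], repeat=n + 1)
                   if q[-1] == s_path[-1])
        p_sn_s = sum(path_prob(q + (s,)) for q in product([0, 1], repeat=n + 1)
                     if q[-1] == s_path[-1])
        rhs = p_sn_s / p_sn
        assert lhs == rhs
print("definition 2 holds on all positive-probability paths")
```

Both sides reduce to the transition probability $\operatorname{trans}[s_n][s]$, which is exactly what the theorem predicts for a chain built from a transition matrix.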