Markov chain and joint distribution


I have the following statement: let $(X_t)_{t \in I}$ be a Markov chain on a discrete space $E$ with an arbitrary non-empty time horizon $I \subset \mathbb R$.

Then $\forall k \in \mathbb N$, $t_0 < ... <t_k \in I$, $a_0,...,a_k \in E$ with $P(X_{t_l}=a_l)>0$ $\forall l=0,...,k-1$:

$P(X_{t_0}=a_0,...,X_{t_k}=a_k)$ = $P(X_{t_0}=a_0) \cdot P(X_{t_1}=a_1\mid X_{t_0}=a_0) \cdot P(X_{t_2}=a_2\mid X_{t_0}=a_0, X_{t_1}=a_1)$ $\cdot \: ... \cdot$ $P(X_{t_k}=a_k \mid X_{t_0}=a_0,...,X_{t_{k-1}}=a_{k-1})$,

because of the law of total probability. This is equal to:

$P(X_{t_0}=a_0) \cdot P(X_{t_1}=a_1\mid X_{t_0}=a_0) \cdot P(X_{t_2}=a_2\mid X_{t_1}=a_1)$ $\cdot \: ... \cdot$ $P(X_{t_k}=a_k \mid X_{t_{k-1}}=a_{k-1})$,

because of the Markov property.
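As a numerical sanity check (not part of the original question), the two factorizations can be verified on a toy chain. The transition matrix, start distribution, and all names below are illustrative assumptions. The conditional probabilities on the right-hand side are computed from the joint law by summation, not read off the transition matrix, so the script genuinely tests the Markov factorization, including a zero-probability path whose one-time marginals are all positive:

```python
import itertools
import numpy as np

# Illustrative toy chain (an assumption, not from the question):
# 3 states, deterministic start in state 0, transition 2 -> 0 impossible.
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.4, 0.6]])
mu0 = np.array([1.0, 0.0, 0.0])

k = 3  # times t_0 < t_1 < t_2 < t_3
states = range(3)

# Full joint law over paths (a_0, ..., a_k).
joint = {path: mu0[path[0]] * np.prod([P[a, b] for a, b in zip(path, path[1:])])
         for path in itertools.product(states, repeat=k + 1)}

def marg(times, vals):
    """P(X_t = v for (t, v) in zip(times, vals)), by summing the joint law."""
    return sum(p for path, p in joint.items()
               if all(path[t] == v for t, v in zip(times, vals)))

def markov_rhs(path):
    """P(X_0 = a_0) * prod_l P(X_l = a_l | X_{l-1} = a_{l-1}),
    with each conditional computed from the joint law by summation."""
    p = marg([0], [path[0]])
    for l in range(1, len(path)):
        p *= marg([l - 1, l], path[l - 1:l + 1]) / marg([l - 1], [path[l - 1]])
    return p

# Positive-probability path: both sides agree.
a = (0, 1, 2, 1)
assert abs(joint[a] - markov_rhs(a)) < 1e-12

# Zero-probability path (it uses the impossible step 2 -> 0), yet every
# one-time marginal P(X_l = a_l) is positive, so every factor on the
# right is well defined and the identity degenerates to 0 = 0.
b = (0, 1, 2, 0)
assert joint[b] == 0 and markov_rhs(b) == 0
```

Here the factor $P(X_3=0\mid X_2=2)$ is zero, which is what makes the right-hand side vanish together with the left.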

I understand this derivation, but it relies on the assumption that

$P(X_{t_0}=a_0,...,X_{t_{k-1}}=a_{k-1}) > 0$. Now I have to show that the equation still holds for $P(X_{t_0}=a_0,...,X_{t_{k-1}}=a_{k-1}) = 0$, in which case it reduces to "$0=0$".

I understand that in this situation the left-hand side $P(X_{t_0}=a_0,...,X_{t_k}=a_k)$ is zero, because this event is a subset of the event assumed to have probability zero.

But how do I explain that the right-hand side

$P(X_{t_0}=a_0) \cdot P(X_{t_1}=a_1\mid X_{t_0}=a_0) \cdot P(X_{t_2}=a_2\mid X_{t_1}=a_1)$ $\cdot \: ... \cdot$ $P(X_{t_k}=a_k \mid X_{t_{k-1}}=a_{k-1})$ is zero as well, if $P(X_{t_0}=a_0,...,X_{t_{k-1}}=a_{k-1}) = 0$?

I don't think it is difficult, but I don't see it. I hope someone can help me.


Best answer:

I think the key is to understand the definition of the conditional law:

Let $(\Omega,\mathcal{F},P)$ be a probability space and let $X:\Omega\rightarrow\mathbb{R}^m$ and $Y:\Omega\rightarrow\mathbb{R}^n$ be two random vectors. The conditional law of $Y$ given $X$ is defined as a transition probability $p:\mathbb{R}^m\times\mathcal{B}(\mathbb{R}^n)\rightarrow[0,1]$ satisfying $$P(X\in B,Y\in C)=\int_B p(x,C)\,P_X(dx)\quad\text{for all }B\in\mathcal{B}(\mathbb{R}^m),\ C\in\mathcal{B}(\mathbb{R}^n).\quad (*) $$ If $q(x,C)$ is another transition probability satisfying $(*)$, then $q(x,C)=p(x,C)$ for all $C\in\mathcal{B}(\mathbb{R}^n)$ and all $x\notin N$, for some set $N$ with $P_X(N)=0$.

The usual notation for the conditional law is $P(Y\in C|X=x)=p(x,C)$. This is well-defined even when $P(X=x)=0$.

From here we obtain the law of total probability when $X$ is discrete: $$P(X=j,Y\in C)=\int_{\{j\}}P(Y\in C|X=x)\,P_X(dx)=P(Y\in C|X=j)\,P(X=j). $$ This holds even when $P(X=j)=0$ (both sides are then zero), but in that case $P(Y\in C|X=j)$ is not uniquely determined.

In the definition of Markov chains, one requires $$P(X_{t_k}=a_k|X_{t_{k-1}}=a_{k-1},\ldots,X_{t_0}=a_0)=P(X_{t_k}=a_k|X_{t_{k-1}}=a_{k-1}). \quad (**)$$ By the definition of the conditional law, both sides of $(**)$ make sense even when $P(X_{t_{k-1}}=a_{k-1},\ldots,X_{t_0}=a_0)=0$, although in that case $P(X_{t_k}=a_k|X_{t_{k-1}}=a_{k-1},\ldots,X_{t_0}=a_0)$ is not unique, and one chooses a version that equals $P(X_{t_k}=a_k|X_{t_{k-1}}=a_{k-1})$.

As shown above, the law of total probability remains true in all cases, so your reasoning also applies when $P(X_{t_{k-1}}=a_{k-1},\ldots,X_{t_0}=a_0)=0$.
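To spell out the last step (a sketch, using the conventions above and the question's assumption that each one-time marginal $P(X_{t_l}=a_l)$ is positive):

```latex
% Why the Markov-form product vanishes on a null path (sketch).
% Let l be the smallest index with P(X_{t_0}=a_0,...,X_{t_l}=a_l) = 0;
% such an l exists, and l >= 1 because P(X_{t_0}=a_0) > 0 by assumption.
% The history up to t_{l-1} then has positive probability, and by the
% law of total probability (the singleton case of (*)):
\[
0 = P(X_{t_0}=a_0,\ldots,X_{t_l}=a_l)
  = \underbrace{P(X_{t_0}=a_0,\ldots,X_{t_{l-1}}=a_{l-1})}_{>\,0}
    \cdot P\bigl(X_{t_l}=a_l \mid X_{t_0}=a_0,\ldots,X_{t_{l-1}}=a_{l-1}\bigr),
\]
% so the conditional probability on the right is zero.  By the Markov
% property (**),
\[
P\bigl(X_{t_l}=a_l \mid X_{t_{l-1}}=a_{l-1}\bigr)
  = P\bigl(X_{t_l}=a_l \mid X_{t_0}=a_0,\ldots,X_{t_{l-1}}=a_{l-1}\bigr)
  = 0,
\]
% hence the product
%   P(X_{t_0}=a_0) * P(X_{t_1}=a_1 | X_{t_0}=a_0) * ...
%     * P(X_{t_k}=a_k | X_{t_{k-1}}=a_{k-1})
% contains a zero factor (the one for index l) and equals 0,
% matching the left-hand side.
```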