$\mathbb{E}\left[\sum_{i=1}^{T_{A}}f(X_{i})\mid X_{1}=l\right]\overset{?}{=}\mathbb{E}\left[\sum_{i=0}^{T_{A}}f(X_{i})\mid X_{0}=l\right]$


Let us consider a Markov chain $(X_{n})_{n\in\mathbb{N}}$ (note that $\mathbb{N}=\{0,1,2,\dots\}$) with state space $\mathcal{S},$ and let $A\subset\mathcal{S}$ denote a subset of $\mathcal{S}.$ Let $T_{A}$ denote the first time the chain hits the subset $A$, $$T_{A}=\inf\{n\ge 0: X_{n}\in A\},$$ so that $T_{A}=0$ if $X_{0}\in A$ and $$T_{A}=\infty\text{ if }\{n\ge 0:X_{n}\in A\}=\emptyset,$$ i.e. if $X_{n}\notin A$ for all $n\in \mathbb{N}.$
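On a finite prefix of a sample path, $T_A$ can be computed by scanning for the first index in $A$; a minimal Python sketch (the names `path` and `A` are illustrative, with `None` standing in for $T_A=\infty$ when the prefix never meets $A$):

```python
def hitting_time(path, A):
    """Return the first index n with path[n] in A, or None if the prefix never enters A."""
    for n, state in enumerate(path):
        if state in A:
            return n
    return None

print(hitting_time([2, 0, 1], A={2}))  # -> 0, since X_0 is already in A
print(hitting_time([0, 1, 2], A={2}))  # -> 2
```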


We want to obtain a recurrence relation for an expectation of the form

$$g_{A}(k):=\mathbb{E}\left[\sum_{i=0}^{T_{A}}f(X_{i})\mid X_{0}=k\right],$$ where $k\in\mathcal{S}\setminus A$ and $f(\cdot)$ is a bounded Borel function.

$$\begin{align} g_{A}(k)&=\mathbb{E}\left[\sum_{i=0}^{T_{A}}f(X_{i})\mid X_{0}=k\right]\tag{1}\\ &=\sum_{l\in \mathcal{S}}\frac{1}{\mathbf{P}( X_{0}=k)}\cdot\mathbb{E}\left[\sum_{i=0}^{T_{A}}f(X_{i})\cdot\mathbb{I}_{\{X_1=l\}}\cdot\mathbb{I}_{\{X_0=k\}}\right]\tag{2}\\ &=\sum_{l\in \mathcal{S}}\frac{\mathbf{P}(X_0=k,X_1=l)}{\mathbf{P}(X_0=k)}\cdot\mathbb{E}\left[f(k)+\sum_{i=1}^{T_{A}}f(X_{i})\mid X_{1}=l,X_{0}=k\right]\tag{3}\\ &=\sum_{l\in \mathcal{S}}p_{kl}\cdot f(k)+\sum_{l\in \mathcal{S}}p_{kl}\cdot\mathbb{E}\left[\sum_{i=1}^{T_{A}}f(X_{i})\mid X_{1}=l,X_{0}=k\right]\tag{4}\\ &=f(k)\cdot \sum_{l\in \mathcal{S}}p_{kl}+\sum_{l\in \mathcal{S}}p_{kl}\cdot{\color{Red} {\mathbb{E}\left[\sum_{i=1}^{T_{A}}f(X_{i})\mid X_{1}=l\right]}}\tag{5}\\ &=f(k)+\sum_{l\in \mathcal{S}}p_{kl}\cdot {\color{Red} {\mathbb{E}\left[\sum_{i=0}^{T_{A}}f(X_{i})\mid X_{0}=l\right]}}\tag{6}\\ &=f(k)+\sum_{l\in \mathcal{S}}p_{kl}\cdot g_{A}(l)\tag{7} \end{align}$$ I don't know why $$\mathbb{E}\left[\sum_{i=1}^{T_{A}}f(X_{i})\mid X_{1}=l\right]=\mathbb{E}\left[\sum_{i=0}^{T_{A}}f(X_{i})\mid X_{0}=l\right].$$
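The resulting recurrence $g_{A}(k)=f(k)+\sum_{l}p_{kl}\,g_{A}(l)$ for $k\in\mathcal{S}\setminus A$, together with the boundary values $g_{A}(a)=f(a)$ for $a\in A$ (since $T_A=0$ there), is a linear system and can be solved directly. A minimal sketch on a made-up 3-state chain (the matrix `P` and reward `f` below are invented for illustration):

```python
import numpy as np

# S = {0, 1, 2}, A = {2} (absorbing here for simplicity); P and f are invented.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.5, 0.3],
              [0.0, 0.0, 1.0]])
f = np.array([1.0, 2.0, 5.0])
T = [0, 1]        # transient states S \ A
A_states = [2]

# g_A(a) = f(a) for a in A (there T_A = 0), and for k in S \ A:
#   g_A(k) = f(k) + sum_l p_{kl} g_A(l)
# i.e. (I - P_TT) g_T = f_T + P_TA f_A as a linear system.
P_TT = P[np.ix_(T, T)]
P_TA = P[np.ix_(T, A_states)]
g_T = np.linalg.solve(np.eye(len(T)) - P_TT, f[T] + P_TA @ f[A_states])
print(g_T)  # [g_A(0), g_A(1)]
```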

There are two answers below.

Answer 1.

First, you can apply the simple Markov property to get the following equality $$\mathbb{E}\left[\sum_{i=1}^{T_A}f(X_i)\mid X_1=l,X_0=k \right]=\mathbb{E}\left[\sum_{i=1}^{T_A}f(X_i)\mid X_1=l \right].$$ Denote $\hat{X}_i=X_{i+1}$, and let $\hat{T}_A=\inf \{ n \geq 0 : \hat{X}_n \in A\}$ be the hitting time of the shifted chain. Since $X_0=k \notin A$, we have $T_A=\inf \{ n \geq 1 : X_n \in A\}=\hat{T}_A+1$, and hence $$\sum_{i=1}^{T_A}f(X_i)=\sum_{i=0}^{\hat{T}_A}f(\hat{X}_i).$$

Next, you can remark that the chain $(\hat{X}_0,\dots,\hat{X}_{n})\mid \hat{X}_0=l$ has the same law as $(X_0,\dots,X_n)\mid X_0=l$, by the Markov property and time homogeneity. Using this equality in law together with the identity above, we get
$$\mathbb{E}\left[\sum_{i=1}^{T_A}f(X_i)\mid X_1=l \right]=\mathbb{E}\left[\sum_{i=0}^{\hat{T}_A}f(\hat{X}_i)\mid \hat{X}_0=l \right]=\mathbb{E}\left[\sum_{i=0}^{T_A}f(X_i)\mid X_0=l \right].$$ I hope it helps you.

Answer 2.

It is easier to understand if we treat $T_A$ as a function of sample paths of $X$. Indeed, for each deterministic sequence $x = (x_n)_{n\in\mathbb{N}}$ in $\mathcal{S}^{\mathbb{N}}$, where $\mathbb{N} = \{0, 1, 2, \ldots\}$, we define $\mathbf{T}_A(x)$ by

$$ \mathbf{T}_A(x) = \inf\{n \geq 0 : x_n \in A\}. \tag{1} $$

Hence, $\mathbf{T}_A : \mathcal{S}^{\mathbb{N}} \to \mathbb{N} \cup\{\infty\}$ is a fixed, measurable function. Then we can realize $T_A$ as the composition of $\mathbf{T}_A$ and $X$, that is,

$$ T_A(\omega) = (\mathbf{T}_A \circ X)(\omega) = \mathbf{T}_A(X(\omega)) = \mathbf{T}_A\left( (X_n(\omega))_{n\in\mathbb{N}} \right). $$

One key property of $\mathbf{T}_A$ is that it satisfies the following recursive formula:

$$ \mathbf{T}_A((x_n)_{n\in\mathbb{N}}) = \begin{cases} 1 + \mathbf{T}_A((x_{n+1})_{n\in\mathbb{N}}), & x_0 \notin A \\ 0, & x_0 \in A \end{cases} \tag{2} $$

This is almost trivial from the definition $\text{(1)}$.
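The recursion $\text{(2)}$ translates directly into code; a minimal Python sketch on finite prefixes (an illustration only, with `None` playing the role of $\infty$ when the prefix never meets $A$):

```python
def T_A(x, A):
    """Hitting time of A along the finite prefix x, computed via recursion (2)."""
    if not x:
        return None           # the prefix never meets A: the T_A = infinity case
    if x[0] in A:
        return 0              # x_0 in A
    rest = T_A(x[1:], A)      # hitting time of the shifted sequence (x_{n+1})
    return None if rest is None else 1 + rest

print(T_A([3, 1, 4, 1, 5], A={4}))  # -> 2
```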


Now we turn to explaining the highlighted steps in OP's question. Let $\hat{X} = (\hat{X}_n)_{n\in\mathbb{N}}$ be defined by shifting $X$ by one unit of index, i.e., $\hat{X}_n = X_{n+1}$. If $X_0 \in \mathcal{S} \setminus A$, then by $\text{(2)}$ we get

$$ T_A = \mathbf{T}_A(X) = \mathbf{T}_A(\hat{X}) + 1. $$

Let $k \in \mathcal{S} \setminus A$. Using the above observation, we find that

\begin{align*} &\mathbb{E}\left[ \sum_{n=1}^{\mathbf{T}_A(X)} f(X_n) \,\middle|\, X_1 = l, X_0 = k \right] \\ &= \mathbb{E}\left[ \sum_{n=1}^{\mathbf{T}_A(\hat{X}) + 1} f(\hat{X}_{n-1}) \,\middle|\, \hat{X}_0 = l, X_0 = k \right] \tag{L1} \\ &= \mathbb{E}\left[ \sum_{n=1}^{\mathbf{T}_A(\hat{X}) + 1} f(\hat{X}_{n-1}) \,\middle|\, \hat{X}_0 = l \right] \tag{L2} \\ &= \mathbb{E}\left[ \sum_{n=0}^{\mathbf{T}_A(\hat{X})} f(\hat{X}_{n}) \,\middle|\, \hat{X}_0 = l \right] \tag{L3} \\ &= \mathbb{E}\left[ \sum_{n=0}^{\mathbf{T}_A(X)} f(X_{n}) \,\middle|\, X_0 = l \right] \tag{L4} \end{align*}

Here are some explanations:

  • $\text{(L1)}$ : Because $T_A = \mathbf{T}_A(\hat{X}) + 1$ given $X_0 = k$ and $X_n = \hat{X}_{n-1}$ for $n \geq 1$.

  • $\text{(L2)}$ : Because, conditionally on $\hat{X}_0 = l$, the chain $\hat{X}$ is independent of $X_0$ by the Markov property.

  • $\text{(L3)}$ : Substituting $n \mapsto n+1$.

  • $\text{(L4)}$ : Because $(\hat{X} \mid \hat{X}_0 = l)$ has the same law as $(X \mid X_0 = l)$ by the Markov property.