First step analysis on mean time to absorption of Markov chains


From "An introduction to Stochastic Modeling" by Pinsky and Karlin:

Let $T = \min \{n \ge 0 : X_n \ge r\}$ where $\{X_n\}$ is a Markov chain with transient states $0, 1, \dots, r-1$ and absorbing states $r, r+1, \dots, N$. Define $w_i = E[\sum_{n=0}^{T-1}g(X_n) | X_0 = i]$, where $g$ is some function. Then $$w_i = g(i) + \sum_{j=0}^{r-1}P_{ij}w_j, \text{ for } i = 0, \dots, r-1.$$
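As a sanity check, the stated relation is just the linear system $(I-Q)w = g$ over the transient states, where $Q$ is the transient-to-transient block of $P$. A minimal numerical sketch (the $5$-state chain below is made up for illustration, not from the book):

```python
import numpy as np

# A toy chain (an assumption, not from the book): transient states 0, 1, 2
# (so r = 3) and absorbing states 3, 4 (so N = 4).
P = np.array([
    [0.2, 0.3, 0.2, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.2, 0.2],
    [0.2, 0.2, 0.1, 0.3, 0.2],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])
r = 3
g = np.array([1.0, 2.0, 3.0])   # g on the transient states only

# The relation w_i = g(i) + sum_{j<r} P_ij w_j, for i = 0..r-1,
# is the linear system (I - Q) w = g with Q = P restricted to
# the transient states.
Q = P[:r, :r]
w = np.linalg.solve(np.eye(r) - Q, g)

# The recursion should hold componentwise.
print(w)
print(np.allclose(w, g + Q @ w))
```

Taking $g \equiv 1$ in the same code recovers the mean time to absorption $E[T \mid X_0 = i]$.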

I have no idea how to show this axiomatically. I start by conditioning on the first step.

$$E[\sum_{n=0}^{T-1}g(X_n) | X_0 = i] = \sum_{k=0}^{N} E[\sum_{n=0}^{T-1}g(X_n) | X_1 = k, X_0 = i]P_{ik}$$

Then, for $k$ a transient state: $$E[\sum_{n=0}^{T-1}g(X_n) | X_1 = k, X_0 = i] =$$ $$ \sum_{x_2, x_3, \dots, x_{T-1}} (g(i) + g(k) + g(x_2) + \dots + g(x_{T-1}))P(X_{T-1} = x_{T-1}, \dots, X_2 = x_2 | X_1 = k, X_0 = i)$$

The Markov property implies:

$$= \sum_{x_2, x_3, \dots, x_{T-1}} (g(i) + g(k) + g(x_2) + \dots + g(x_{T-1}))P(X_{T-1} = x_{T-1}, \dots, X_2 = x_2 | X_1 = k)$$

But I can't figure out an axiomatic way to show that this equals

$$= \sum_{x_2, x_3, \dots, x_{T-1}} (g(i) + g(k) + g(x_2) + \dots + g(x_{T-1}))P(X_{T-2} = x_{T-1}, \dots, X_1 = x_2 | X_0 = k).$$

It makes sense intuitively, since reaching an absorbing state from $k$ at time $1$ should be the same as starting the chain from $k$ at time $0$, but I can't derive this from the axioms of probability and the properties of Markov chains.

Anyone have any ideas?


Best answer:

What you need to do after conditioning on the first step is to split the sum, so as to "peel off" the $g(i)$ term. This looks like:

$$E \left [ \sum_{n=0}^{T-1} g(X_n) \mid X_0=i \right ] = \sum_{j=0}^N P_{ij} E \left [ \sum_{n=0}^{T-1} g(X_n) \mid X_0=i,X_1=j \right ] \\ = g(i) + \sum_{j=0}^N P_{ij} E \left [ \sum_{n=1}^{T-1} g(X_n) \mid X_0=i,X_1=j \right ]$$

since $\sum_j P_{ij}=1$. Now the Markov property lets you throw away the condition $X_0=i$, because $X_1,\dots,X_{T-1}$ and $T$ are conditionally independent of $X_0$ given $X_1$. The last step is to use time homogeneity to show that $E \left [ \sum_{n=1}^{T-1} g(X_n) \mid X_1=j \right ]=w_j$. This is intuitively fairly obvious, as you are just relabeling time $1$ as the initial time and otherwise doing the same thing; but since the definition of time homogeneity involves only deterministic times, writing it out in full requires some bookkeeping with the random time $T$.
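One way to convince yourself the recursion is correct without chasing the conditioning on $T$ is to compare the exact solution of $(I-Q)w=g$ with a Monte Carlo estimate of $w_i$. A sketch, using a made-up $5$-state chain (not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy chain (an assumption for illustration): transient states 0, 1, 2
# (r = 3), absorbing states 3, 4.
P = np.array([
    [0.2, 0.3, 0.2, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.2, 0.2],
    [0.2, 0.2, 0.1, 0.3, 0.2],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])
r = 3
g = np.array([1.0, 2.0, 3.0])   # g on the transient states

def sample_w(i, n_paths=50_000):
    """Monte Carlo estimate of E[sum_{n=0}^{T-1} g(X_n) | X_0 = i]."""
    total = 0.0
    for _ in range(n_paths):
        x = i
        while x < r:                      # T is the first hit of {r, ..., N}
            total += g[x]                 # accumulate g up to time T - 1
            x = rng.choice(5, p=P[x])     # take one step of the chain
    return total / n_paths

# Exact answer from the first-step recursion (I - Q) w = g.
Q = P[:r, :r]
w = np.linalg.solve(np.eye(r) - Q, g)
est = sample_w(0)
print(w[0], est)   # the two should agree to Monte Carlo accuracy
```

The simulation starts the clock at $0$ from state $i$ exactly as in the definition of $w_i$, so agreement with the linear-system solution is evidence for the relabeling-of-time step above.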