Markov Chain hitting 2 states one after another - expected time?


Suppose we are given an ergodic Markov chain $(X_n)_{n\in\mathbb{N}_0}$ with finite state space $E$. By "ergodic" I mean that one can reach any state from any other state in finitely many steps with non-zero probability.

Say $X_0=s_0$ almost surely. Let $s,s'\in E$ with $\mathbb{P}(X_{n+1}=s'\mid X_n=s)>0$ for all $n\in\mathbb{N}_0$. I am interested in the expectation of $\tau_{s,s'}:=\inf\{n\in\mathbb{N}_0\mid X_n=s,X_{n+1}=s'\}$. I am familiar with the system of equations for the case where only a single state is to be hit; see e.g. Grinstead and Snell, Section 11.5.

How does one tackle $\mathbb{E}[\tau_{s,s'}]$? More specifically, I would like to find some sort of lower bound on this expectation that behaves somewhat like $1/\mathbb{P}(X_{n+1}=s'\mid X_n=s)$ (my intuition tells me that this should be a lower bound, but certainly not the sharpest). Any reference or help is appreciated.

2 Answers

Define $Y_n = (X_n,X_{n+1})$ for $n\in\mathbb{N}_0$. Then $(Y_n)_{n\in\mathbb{N}_0}$ is a Markov process and $$\tau_{s,s'} = \inf\{n: Y_n = (s,s')\} =: \tau^Y_{(s,s')}.$$ That is, you can use the same theory, with the addition that you need to average over the first step $X_1$, since $Y_0=(X_0,X_1)$ is not determined by $X_0$ alone: $$\mathbb{E}[\tau_{s,s'} \mid X_0 = s_0] = \mathbb{E}\big[\mathbb{E}[\tau^Y_{(s,s')} \mid Y_0]\,\big\vert\, X_0 = s_0\big].$$
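As a sanity check, the pair-chain construction can be sketched in Python. A minimal sketch, assuming numpy; the $3$-state transition matrix below is a made-up example, not from the question:

```python
import numpy as np

# A made-up ergodic 3-state chain; Pi is its row-stochastic transition matrix.
Pi = np.array([[0.5, 0.3, 0.2],
               [0.1, 0.6, 0.3],
               [0.4, 0.4, 0.2]])
r = Pi.shape[0]

# Pair chain Y_n = (X_n, X_{n+1}), with the pair (a, b) encoded as a*r + b.
# It can only move (a, b) -> (b, c), and does so with probability Pi[b, c].
PY = np.zeros((r * r, r * r))
for a in range(r):
    for b in range(r):
        for c in range(r):
            PY[a * r + b, b * r + c] = Pi[b, c]

assert np.allclose(PY.sum(axis=1), 1.0)      # PY is row-stochastic
assert PY[0 * r + 1, 1 * r + 2] == Pi[1, 2]  # (0,1) -> (1,2) w.p. Pi[1,2]
assert PY[0 * r + 1, 0 * r + 2] == 0.0       # (0,1) cannot jump to (0,2)
```

With this encoding, hitting the pair state $s\cdot r + s'$ in `PY` is exactly the event $X_n=s,\,X_{n+1}=s'$.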

Edit A partial answer on your bound:

Let $\pi$ be the stationary distribution of the chain $X$. The stationary distribution of $Y$ is then $\tilde{\pi}(s,s') = p(s'\vert s) \pi(s)$, where $p$ denotes the transition probabilities of $X$.

We have the following identity (cf. your reference): let $R_{(s,s')}$ denote the first recurrence time to the state $(s,s')$ when starting in $(s,s')$. Then

$$\frac{1}{\tilde{\pi}(s,s')} = \mathbb{E}[R_{(s,s')}] = 1 + \sum_{(s'',s''')} \mathbb{P}[Y_1 = (s'',s''') \vert Y_0 = (s,s')]\, \mathbb{E}[\tau_{(s,s')} \vert Y_0 = (s'',s''')].$$

Now we can only transition from $(s,s')$ to states that have first entry $s'$, hence the sum on the right reduces to $$\sum_{s'''} \mathbb{P}[X_1 = s''' \vert X_0 = s']\, \mathbb{E}[\tau_{(s,s')} \vert Y_0 = (s',s''')] = \mathbb{E}[\tau_{(s,s')} \vert X_0 = s'].$$

This then gives $$ \frac{1}{p(s' \vert s)\pi(s)} - 1 = \mathbb{E}[\tau_{(s,s')} \vert X_0 = s']\ . $$
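This identity is easy to check numerically. The sketch below (numpy assumed; the $3$-state chain and the choice $s=0,\,s'=1$ are made-up illustrations) solves the standard hitting-time equations on the pair chain and compares with $1/(p(s'\vert s)\pi(s)) - 1$:

```python
import numpy as np

# Made-up ergodic 3-state chain; s=0, s'=1 is the targeted transition.
Pi = np.array([[0.5, 0.3, 0.2],
               [0.1, 0.6, 0.3],
               [0.4, 0.4, 0.2]])
r = Pi.shape[0]
s, sp = 0, 1

# Stationary distribution pi of X: left eigenvector of Pi for eigenvalue 1.
w, v = np.linalg.eig(Pi.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

# Pair chain Y_n = (X_n, X_{n+1}), state (a, b) encoded as a*r + b.
PY = np.zeros((r * r, r * r))
for a in range(r):
    for b in range(r):
        for c in range(r):
            PY[a * r + b, b * r + c] = Pi[b, c]

# Expected hitting times h(y) of the pair state (s, s'):
# h(target) = 0 and h(y) = 1 + sum_z PY[y, z] h(z) for y != target.
target = s * r + sp
keep = [y for y in range(r * r) if y != target]
h = np.zeros(r * r)
h[keep] = np.linalg.solve(np.eye(len(keep)) - PY[np.ix_(keep, keep)],
                          np.ones(len(keep)))

# E[tau | X_0 = s'] averages h(Y_0) over the first step X_1 out of s'.
e_tau = sum(Pi[sp, c] * h[sp * r + c] for c in range(r))

lhs = 1.0 / (pi[s] * Pi[s, sp]) - 1.0
assert abs(lhs - e_tau) < 1e-8
```

The two sides agree to numerical precision, as the Kac-type recurrence argument above predicts.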


$\def\ed{\stackrel{\text{def}}{=}}$ By adding just one extra absorbing state to your chain, you can reduce your problem to that of finding the expected time to hit that single absorbing state. I'll assume here that your chain $\ \big(X_n\big)_{n\in\mathbb{N}_0}\ $ is time-homogeneous (which is normally required as part of the definition of ergodicity) and has only a finite number of states. In your cited reference, Grinstead and Snell's Introduction to Probability, chains with these properties are the only ones considered.

Let $\ r\ $ be the number of states of the chain, and $\ \Pi\ $ its $\ r\times r\ $ row-stochastic transition matrix. That is, the entry in the $\ i^\text{th}\ $ row and $\ j^\text{th}\ $ column of $\ \Pi\ $ is $\ \pi_{ij}\ed\mathbb{P}\big(X_{n+1}=j\ \big|\,X_n=i \big)\ .$ Let $\ Q\ $ be the $\ r\times r\ $ matrix whose entry in its $\ i^\text{th}\ $ row and $\ j^\text{th}\ $ column is $\ \pi_{ij}\ $ unless $\ i=s\ $ and $\ j=s'\ ,$ in which case it is $\ 0\ .$ Then $$ Q=\Pi-\pi_{ss'}\mathbf{d}_s\mathbf{d}_{s'}^T\ , $$ where $\ \mathbf{d}_\sigma\ $ is the $\ r\times1\ $ column vector whose $\ \sigma^\text{th}\ $ entry is $\ 1\ $ and all of whose other entries are zero.

Let $$ Z_n=\cases{X_n&if $\ n\le \tau_{ss'}\ $\\ r+1&if $\ n\ge \tau_{ss'}+1\ $.} $$ The process $\ \big(Z_n\big)_{n\in\mathbb{N}_0}\ $ is a Markov chain with an absorbing state, $\ r+1\ ,$ and $\ (r+1)\times(r+1)\ $ transition matrix $\ P\ $ given by $$ P=\pmatrix{Q&\pi_{ss'}\mathbf{d}_s\\0_{1\times r}&1}\ ; $$ $\ Z_n\ $ coincides with $\ X_n\ $ up to and including the time $\ \tau_{ss'}\ ,$ but moves to the absorbing state $\ r+1\ $ at time $\ \tau_{ss'}+1\ $ and remains in that state thereafter. Thus $\ \tau_{ss'}+1\ $ is the first passage time of $\ \big(Z_n\big)_{n\in\mathbb{N}_0}\ $ from the state $\ s_0\ $ to the state $\ r+1\ .$ Pages 416–419 of your cited reference explain how to calculate $\ \mathbb{E}\big(\tau_{ss'}+1\big)\ .$ Theorem $\mathbf{11.5}\ $ tells us that if $\ t_\sigma\ $ denotes the expected time for the chain to reach the absorbing state when started in state $\ \sigma\ $ (so that $\ t_{s_0}= \mathbb{E}\big(\tau_{ss'}+1\big)\ $), $\ \mathbf{t}\ $ is the $\ r\times1\ $ column vector whose $\ \sigma^\text{th}\ $ entry is $\ t_\sigma\ $ and $\ \mathbf{1}\ $ the $\ r\times1\ $ column vector all of whose entries are $1$, then $\ \mathbf{t}\ $ satisfies the equation \begin{align} \mathbf{1}&=\big(I_{r\times r}-Q\big)\mathbf{t}\\ &=\big(I_{r\times r}-\Pi+\pi_{ss'}\mathbf{d}_s\mathbf{d}_{s'}^T\big)\mathbf{t}\ ,\label{eq1}\tag{1} \end{align} from which you can derive a couple of lower bounds for $\ t_{s_0}\ .$ Multiplying the equation on the left by $\ \mathbf{d}_\sigma^T\ $ with $\ \sigma\ne s\ $ gives \begin{align} 1=\mathbf{d}_\sigma^T\mathbf{1}&=t_\sigma-\sum_{j=1}^r\pi_{\sigma j}t_j\\ &=(1-\pi_{\sigma\sigma})t_\sigma-\sum_{j\ne\sigma}\pi_{\sigma j}t_j\\ &\le(1-\pi_{\sigma\sigma})t_\sigma\ , \end{align} from which you get the lower bound \begin{align} \mathbb{E}\big(\tau_{ss'}\big)&=t_{s_0}-1\\ &\ge \frac{1}{1-\pi_{s_0s_0}}-1\\ &=\frac{\pi_{s_0s_0}}{1-\pi_{s_0s_0}}\\ &=\frac{\mathbb{P}\big(\, X_{n+1}=s_0\,\big|\,X_n=s_0\big)}{\mathbb{P}\big(\, X_{n+1}\ne s_0\,\big|\,X_n= s_0\big)}\label{eq2}\tag{2} \end{align} for $\ s_0\ne s\ ,$ curiously independent of $\ s\ $ and $\ s'\ .$
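The lower bound can be checked numerically. The sketch below (numpy assumed; the $3$-state chain and $s=0,\,s'=1$ are made-up illustrations) computes $\mathbb{E}[\tau_{ss'}\mid X_0=s_0]$ via the pair chain $Y_n=(X_n,X_{n+1})$ and verifies $\mathbb{E}[\tau_{ss'}]\ge\pi_{s_0s_0}/(1-\pi_{s_0s_0})$ for each $s_0\ne s$:

```python
import numpy as np

# Made-up ergodic 3-state chain; s=0, s'=1 is the targeted transition.
Pi = np.array([[0.5, 0.3, 0.2],
               [0.1, 0.6, 0.3],
               [0.4, 0.4, 0.2]])
r = Pi.shape[0]
s, sp = 0, 1

# Expected hitting times of the pair state (s, s') in Y_n = (X_n, X_{n+1}),
# with the pair (a, b) encoded as a*r + b.
PY = np.zeros((r * r, r * r))
for a in range(r):
    for b in range(r):
        for c in range(r):
            PY[a * r + b, b * r + c] = Pi[b, c]
target = s * r + sp
keep = [y for y in range(r * r) if y != target]
h = np.zeros(r * r)
h[keep] = np.linalg.solve(np.eye(len(keep)) - PY[np.ix_(keep, keep)],
                          np.ones(len(keep)))

checks = []
for s0 in range(r):
    if s0 == s:
        continue
    # E[tau | X_0 = s0]: average h(Y_0) over the first step out of s0.
    e_tau = sum(Pi[s0, c] * h[s0 * r + c] for c in range(r))
    bound = Pi[s0, s0] / (1.0 - Pi[s0, s0])
    checks.append((e_tau, bound))
    assert e_tau >= bound  # lower bound (2) holds
```

For this example the bound is quite loose (the exact expectations are much larger), consistent with it depending only on $\pi_{s_0s_0}$.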

Multiplying equation (\ref{eq1}) on the left by $\ \mathbf{d}_s^T\ $ gives \begin{align} 1=\mathbf{d}_s^T\mathbf{1}&=t_s-\sum_{j=1}^r\pi_{sj}t_j+\pi_{ss'}t_{s'}\\ &=(1-\pi_{ss})t_s-\sum_{j\ne s,s'}\pi_{sj}t_j\\ &\le(1-\pi_{ss})t_s\ , \end{align} provided $\ s\ne s'\ .$ Thus, the same lower bound (\ref{eq2}) holds when $\ s_0=s\ne s'\ .$

If $\ \mu_\sigma\ed\lim\limits_{n\rightarrow\infty}\mathbb{P}\big(X_n=\sigma\big)\ $ is the long-run probability that the chain $\ \big(X_n\big)_{n\in\mathbb{N}_0}\ $ is in state $\ \sigma\ ,$ and $\ \mu\ $ the $\ r\times1\ $ column vector whose $\ \sigma^\text{th}\ $ entry is $\ \mu_\sigma\ ,$ then $$ \mu^T\Pi=\mu^T\ , $$ and if we multiply equation (\ref{eq1}) on the left by $\ \mu^T\ $ we get \begin{align} 1=\mu^T\mathbf{1}&=\big(\mu^T-\mu^T\Pi+\pi_{ss'}\mu_s\mathbf{d}_{s'}^T\big)\mathbf{t}\\ &=\pi_{ss'}\mu_st_{s'}\ . \end{align} This is the same identity obtained in a_student's answer. When $\ s_0=s'\ne s\ ,$ it gives (using $\ \mu_s\le1\ $ for the inequality) \begin{align} \mathbb{E}\big[\tau_{ss'}\big]&=t_{s_0}-1\\ &=t_{s'}-1\\ &=\frac{1}{\mu_s\pi_{ss'}}-1\\ &=\frac{1}{\mu_{s}\mathbb{P}\big(X_{n+1}=s'\ \big|\,X_n=s \big)}-1\\ &\ge\frac{1}{\mathbb{P}\big(X_{n+1}=s'\ \big|\,X_n=s \big)}-1\\ &=\frac{\mathbb{P}\big(X_{n+1}\ne s'\ \big|\,X_n=s \big)}{\mathbb{P}\big(X_{n+1}=s'\ \big|\,X_n=s \big)}\\ &=\frac{\mathbb{P}\big(X_{n+1}\ne s_0\ \big|\,X_n=s \big)}{\mathbb{P}\big(X_{n+1}=s_0\ \big|\,X_n=s \big)}\ . \end{align}
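A quick Monte Carlo cross-check of the identity $\mathbb{E}[\tau_{ss'}\mid X_0=s']=1/(\mu_s\pi_{ss'})-1$ (numpy assumed; the $3$-state chain, the choice $s=0,\,s'=1$, and the run count are made up for illustration):

```python
import numpy as np

# Made-up ergodic 3-state chain; tau is the first n with X_n=s, X_{n+1}=s'.
Pi = np.array([[0.5, 0.3, 0.2],
               [0.1, 0.6, 0.3],
               [0.4, 0.4, 0.2]])
r = Pi.shape[0]
s, sp = 0, 1

# mu: limiting (stationary) distribution, left eigenvector for eigenvalue 1.
w, v = np.linalg.eig(Pi.T)
mu = np.real(v[:, np.argmax(np.real(w))])
mu /= mu.sum()
exact = 1.0 / (mu[s] * Pi[s, sp]) - 1.0  # claimed E[tau | X_0 = s']

rng = np.random.default_rng(0)
n_runs = 20_000
total = 0
for _ in range(n_runs):
    x, n = sp, 0  # start the chain at X_0 = s'
    while True:
        x_next = rng.choice(r, p=Pi[x])  # draw X_{n+1}
        if x == s and x_next == sp:      # the transition s -> s' occurred
            break                        # so tau = n
        x, n = x_next, n + 1
    total += n
estimate = total / n_runs

# The empirical mean should match the formula up to Monte Carlo error.
assert abs(estimate - exact) / exact < 0.05
```

This is only a statistical check, of course; the exact agreement is already established by the linear-algebra argument above.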