Why does $\alpha(x) \mathbb E^{x}[\tau_{x}]=\mathbb E^{\alpha}\left[1_{\{X_{0}=x\} }\tau_{x}\right]$


Let $\alpha$ be the stationary distribution on the countable state space $E$, and let $x \in E$.

I am struggling with a step in the proof that $\alpha(x) \mathbb E^{x}[\tau_{x}]=\mathbb{P}^{\alpha}\left(\tau_{x}<\infty\right)$.

Here $\mathbb{P}^{\alpha}$ denotes the law of the chain with initial distribution $\alpha$, while $\mathbb{P}^{x}$ denotes the law of the chain started almost surely in state $x$.

My Problem:

Why is

$$\alpha(x) \mathbb E^{x}[\tau_{x}]=\mathbb E^{\alpha}\left[1_{\{X_{0}=x\}}\tau_{x}\right]\,?$$

It presumably has something to do with stationarity, but I cannot find an appropriate justification.

Any help is greatly appreciated.

Best answer:

If $\alpha(x)=0$, then both sides are 0 and the statement is trivially true.

If not, then $$\mathbb{E}^x[\tau_x] = \mathbb{E}^{\alpha}[\tau_x \mid X_0 = x] = \frac{\mathbb{E}^\alpha[1_{\{X_0 = x\}} \tau_x]}{\mathbb{P}^\alpha(X_0 = x)} = \frac{\mathbb{E}^\alpha[1_{\{X_0 = x\}} \tau_x]}{\alpha(x)}$$ by the Markov property, the definition of conditional expectation, and the definition of $\mathbb{P}^\alpha$.
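As a quick numerical sanity check (not part of the proof), the identity can be verified on a small example. In the sketch below, the 3-state transition matrix `P`, its stationary distribution `alpha`, and the choice `x = 0` are all hypothetical illustrations: the left-hand side $\alpha(x)\,\mathbb{E}^x[\tau_x]$ is computed exactly by first-step analysis, while the right-hand side $\mathbb{E}^\alpha[1_{\{X_0=x\}}\tau_x]$ is estimated by Monte Carlo, so the two agree only up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-state example chain (birth-death, hence reversible).
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
alpha = np.array([0.25, 0.50, 0.25])   # stationary for this P (any distribution works)
x = 0

# Exact left-hand side: E^x[tau_x] by first-step analysis.
# h(y) = E^y[tau_x] solves (I - Q) h = 1, where Q is P restricted to E \ {x}.
others = [1, 2]
Q = P[np.ix_(others, others)]
h = np.linalg.solve(np.eye(2) - Q, np.ones(2))
lhs = alpha[x] * (1.0 + P[x, others] @ h)

# Monte Carlo right-hand side: sample X_0 ~ alpha, run the chain,
# record 1_{X_0 = x} * tau_x.
def sample_tau(start):
    state, t = start, 0
    while True:
        state = rng.choice(3, p=P[state])
        t += 1
        if state == x:
            return t

total, n_runs = 0.0, 50_000
for _ in range(n_runs):
    x0 = rng.choice(3, p=alpha)
    if x0 == x:
        total += sample_tau(x0)
rhs = total / n_runs

print(lhs, rhs)   # should agree up to Monte Carlo error
```

For this chain, $\mathbb{E}^0[\tau_0] = 4$ and $\alpha(0) = 1/4$, so both sides should be close to $1$.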

It would be true for any probability distribution $\alpha$; it doesn't have to be the stationary distribution.

Update. The claim that $\mathbb{E}^x[\tau_x] = \mathbb{E}^{\alpha}[\tau_x \mid X_0 = x]$ is a consequence of the Markov property, but it takes a certain amount of (elementary but tedious) work to prove it from the original definition of the Markov property $$\mathbb{P}(X_{n+1}=x_{n+1}\mid X_{n}=x_{n},\dots,X_{0}=x_{0})=\mathbb{P}(X_{n+1}=x_{n+1}\mid X_{n}=x_{n}). \tag{MP}$$

First, we need to be clear on the definitions of $\mathbb{P}^\alpha$ and $\mathbb{P}^x$. $\mathbb{P}^\alpha$ should be defined as a probability measure on the sample space $(\Omega, \mathcal{F})$ such that:

  1. The process $(X_0, X_1, \dots)$ is still a Markov chain under $\mathbb{P}^\alpha$; i.e. (MP) holds for $\mathbb{P}^\alpha$ whenever $\mathbb{P}^\alpha(X_n = x_n, \dots, X_0 = x_0) > 0$.

  2. The process $(X_0, X_1, \dots)$ has the same transition probabilities under $\mathbb{P}^\alpha$ as under $\mathbb{P}$; i.e. we have $\mathbb{P}^\alpha(X_{n+1} = z \mid X_n = y) = \mathbb{P}(X_{n+1} = z \mid X_n = y)$ whenever $\mathbb{P}^\alpha(X_n = y) > 0$.

  3. $\mathbb{P}^\alpha(X_0 = y) = \alpha(y)$ for all $y \in E$.

Note that all of this makes sense for any probability distribution $\alpha$ on $E$, not necessarily a stationary distribution. $\mathbb{P}^x$ is defined similarly except that in place of (3), we have $\mathbb{P}^x(X_0 = x) = 1$.
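To make conditions (1)-(3) concrete: operationally, $\mathbb{P}^\alpha$ and $\mathbb{P}^x$ describe the same chain run from different initial distributions. A minimal simulation sketch (the 3-state chain `P` and distribution `alpha` are hypothetical examples) where the only difference between the two measures is the initial distribution passed to the sampler:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-state example chain.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])

def sample_path(init, n_steps):
    """Sample (X_0, ..., X_n) with X_0 ~ init and transition matrix P.

    Passing a distribution gives a draw under P^alpha;
    passing a point mass at x gives a draw under P^x.
    """
    path = [rng.choice(3, p=init)]
    for _ in range(n_steps):
        path.append(rng.choice(3, p=P[path[-1]]))
    return path

alpha = np.array([0.25, 0.50, 0.25])
point_mass_at_0 = np.array([1.0, 0.0, 0.0])

print(sample_path(alpha, 5))            # a path under P^alpha
print(sample_path(point_mass_at_0, 5))  # a path under P^0, so X_0 = 0
```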

Let us assume throughout that $\alpha(x) > 0$ since otherwise, as noted, everything is trivial.

The basic idea is that the process $X_n$ looks the same under $\mathbb{P}^\alpha$ conditioned on $X_0 = x$ as it does under $\mathbb{P}^x$. Specifically:

Lemma 1. For every $x_1, \dots, x_n$, we have $$\mathbb{P}^\alpha(X_n = x_n, \dots, X_1 = x_1 \mid X_0 = x) = \mathbb{P}^x(X_n = x_n, \dots, X_1 = x_1).$$

Proof. By induction on $n$. For $n=1$, we have, from (2) of the definition, $$\mathbb{P}^\alpha(X_1 = x_1 \mid X_0 = x) = \mathbb{P}(X_1 = x_1 \mid X_0 = x) = \mathbb{P}^x(X_1 = x_1 \mid X_0 = x) = \mathbb{P}^x(X_1 = x_1)$$ where the last equality is because $\mathbb{P}^x(X_0 = x) = 1$. For the inductive step, suppose the lemma holds for $n$. Now by conditioning, we have

$$\begin{align} \mathbb{P}^\alpha(X_{n+1} = x_{n+1}, \dots, X_1 = x_1 \mid X_0 = x) &= \mathbb{P}^\alpha(X_{n+1} = x_{n+1} \mid X_n = x_n, \dots, X_0 = x) \cdot\mathbb{P}^\alpha(X_n = x_n, \dots, X_1 = x_1 \mid X_0 = x) &&\text{conditioning} \\ &= \mathbb{P}^\alpha(X_{n+1} = x_{n+1} \mid X_n = x_n) \cdot \mathbb{P}^\alpha(X_n = x_n, \dots, X_1 = x_1 \mid X_0 = x) && \text{Markov property} \\ &= \mathbb{P}^x(X_{n+1} = x_{n+1} \mid X_n = x_n) \cdot \mathbb{P}^x(X_n = x_n, \dots, X_1 = x_1 \mid X_0 = x) && \text{def. 2 and induction hyp.} \\ &= \mathbb{P}^x(X_{n+1} = x_{n+1} \mid X_n = x_n, \dots, X_0 = x) \cdot\mathbb{P}^x(X_n = x_n, \dots, X_1 = x_1 \mid X_0 = x) && \text{Markov property}\\ &= \mathbb{P}^x(X_{n+1} = x_{n+1}, \dots, X_1 = x_1 \mid X_0 = x) && \text{conditioning} \\ &= \mathbb{P}^x(X_{n+1} = x_{n+1}, \dots, X_1 = x_1) && (\mathbb{P}^x(X_0 = x) = 1) \end{align}$$ I leave it as an exercise to verify that the claim still holds in case any of the events being conditioned on above have probability zero.
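Lemma 1 can also be checked mechanically: the finite-dimensional distributions satisfy $\mathbb{P}^\alpha(X_0 = x, X_1 = x_1, \dots, X_n = x_n) = \alpha(x)\, P(x, x_1) \cdots P(x_{n-1}, x_n)$, so the factor $\alpha(x)$ cancels on conditioning, leaving exactly the $\mathbb{P}^x$-probability. A brute-force enumeration over a hypothetical 3-state example chain:

```python
import numpy as np
from itertools import product

# Hypothetical 3-state example chain and a distribution alpha.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
alpha = np.array([0.25, 0.50, 0.25])
x, n = 0, 3

def path_prob(init, path):
    """P(X_0 = path[0], ..., X_k = path[k]) when X_0 ~ init."""
    p = init[path[0]]
    for a, b in zip(path, path[1:]):
        p *= P[a, b]
    return p

point_mass = np.eye(3)[x]
for tail in product(range(3), repeat=n):
    # P^alpha(X_n = x_n, ..., X_1 = x_1 | X_0 = x)
    cond = path_prob(alpha, (x,) + tail) / alpha[x]
    # P^x(X_n = x_n, ..., X_1 = x_1)
    direct = path_prob(point_mass, (x,) + tail)
    assert abs(cond - direct) < 1e-12
print("Lemma 1 verified for all length-3 tails")
```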

Now using this, we can show:

Lemma 2. For any $k$ and any $y \in E$, we have $\mathbb{P}^\alpha(\tau_y = k \mid X_0 = x) = \mathbb{P}^x(\tau_y = k)$.

Proof. Note that the event $\{\tau_y = k\}$ is just the event $\{X_k = y, X_{k-1} \ne y, \dots, X_1 \ne y\}$, so $$\begin{align*} \mathbb{P}^\alpha(\tau_y = k \mid X_0 = x) &= \mathbb{P}^\alpha(X_k = y, X_{k-1} \ne y, \dots, X_1 \ne y \mid X_0 = x) \\ &= \sum_{x_{k-1}, \dots, x_1 \in E \setminus \{y\}} \mathbb{P}^\alpha(X_k = y, X_{k-1} = x_{k-1}, \dots, X_1 = x_1 \mid X_0 = x) \\ &= \sum_{x_{k-1}, \dots, x_1 \in E \setminus \{y\}} \mathbb{P}^x(X_k = y, X_{k-1} = x_{k-1}, \dots, X_1 = x_1) \\ &= \mathbb{P}^x(\tau_y = k). \end{align*}$$
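The decomposition $\{\tau_y = k\} = \{X_k = y, X_{k-1} \ne y, \dots, X_1 \ne y\}$ used in the proof also gives a practical way to compute $\mathbb{P}^x(\tau_y = k)$: restrict $P$ to $E \setminus \{y\}$ (a "taboo" matrix $Q$) and take matrix powers, so that $\mathbb{P}^x(\tau_y = k) = (Q^{k-1} r)(x)$ with $r(z) = P(z, y)$. A sketch on a hypothetical 3-state example chain, checked against brute-force enumeration of the paths in the event above:

```python
import numpy as np
from itertools import product

# Hypothetical 3-state example chain.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
x, y = 0, 2
avoid = [z for z in range(3) if z != y]   # E \ {y}

# Taboo-matrix form: P^x(tau_y = k) = (Q^{k-1} r)(x).
Q = P[np.ix_(avoid, avoid)]
r = P[avoid, y]
ix = avoid.index(x)

def tau_pmf(k):
    return (np.linalg.matrix_power(Q, k - 1) @ r)[ix]

# Brute force over the paths in {X_k = y, X_{k-1} != y, ..., X_1 != y}.
def tau_pmf_bruteforce(k):
    total = 0.0
    for mid in product(avoid, repeat=k - 1):
        path = (x,) + mid + (y,)
        p = 1.0
        for a, b in zip(path, path[1:]):
            p *= P[a, b]
        total += p
    return total

for k in range(1, 6):
    assert abs(tau_pmf(k) - tau_pmf_bruteforce(k)) < 1e-12
print([round(tau_pmf(k), 4) for k in range(1, 6)])
```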

Finally, to prove the desired claim, we note $$\begin{align*}\mathbb{E}^\alpha[\tau_y \mid X_0 = x] &= \sum_{k=1}^\infty k \cdot\mathbb{P}^\alpha(\tau_y = k \mid X_0 = x) \\ &= \sum_{k=1}^\infty k \cdot\mathbb{P}^x(\tau_y = k) \\ &= \mathbb{E}^x[\tau_y]. \end{align*}$$ (If $\mathbb{P}^x(\tau_y = \infty) > 0$, taking complements in Lemma 2 gives $\mathbb{P}^\alpha(\tau_y = \infty \mid X_0 = x) = \mathbb{P}^x(\tau_y = \infty)$, so both sides are $+\infty$ simultaneously and the equality still holds.) Taking $y = x$ recovers $\mathbb{E}^\alpha[\tau_x \mid X_0 = x] = \mathbb{E}^x[\tau_x]$, which is exactly the equality used at the start.
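Putting everything together: for a finite irreducible chain with stationary distribution $\alpha$, the identity from the question reduces to Kac's formula $\alpha(x)\,\mathbb{E}^x[\tau_x] = 1 = \mathbb{P}^\alpha(\tau_x < \infty)$. A deterministic numerical sketch (the 3-state chain is a hypothetical example; $\alpha$ is computed from $\alpha P = \alpha$, and $\mathbb{E}^x[\tau_x]$ by first-step analysis):

```python
import numpy as np

# Hypothetical irreducible 3-state example chain.
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
n = P.shape[0]

# Stationary distribution: solve alpha P = alpha with sum(alpha) = 1.
A = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.concatenate([np.zeros(n), [1.0]])
alpha, *_ = np.linalg.lstsq(A, b, rcond=None)

# Expected return time E^x[tau_x] via first-step analysis:
# h(y) = E^y[tau_x] solves h = 1 + Q h with Q = P restricted to E \ {x},
# and E^x[tau_x] = 1 + sum_{z != x} P(x, z) h(z).
def expected_return_time(P, x):
    others = [y for y in range(P.shape[0]) if y != x]
    Q = P[np.ix_(others, others)]
    h = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))
    return 1.0 + P[x, others] @ h

for x in range(n):
    # Kac's formula: alpha(x) E^x[tau_x] = 1 for an irreducible chain,
    # matching P^alpha(tau_x < infty) = 1 here.
    print(x, alpha[x] * expected_return_time(P, x))
```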