Markov optimal stopping time, infinite horizon


I would like to show the following statement:

Let $\{X_t : t \geqslant 1\}$ be a Markov chain with finite state space $S$ and transition probabilities $p_{ij}$, $i,j \in S$. Let $S_r$ and $S_t$ denote its sets of recurrent and transient states, respectively. Let $g: S \to \mathbb R^+$ be such that $g(s) = 0$ for all $s \in S_r$. Let $T$ be the set of stopping times over $\tau = 1,2,3,\ldots$; that is, $\tau \in T$ is a random variable taking values in $\mathbb N$ such that $\{\tau = t\} \in \sigma(X_1,\ldots,X_t)$ for each $t \in \mathbb N$.

Given an initial value $X_1 = i$, define the optimal stopping problem $v^*(i)$ by $$v^*(i) = \sup\{E_i[g(X_\tau)] : \tau \in T\},$$ where $E_i$ is the expected value when the initial value of the chain equals $i$.

I want to prove that $v^*$ is the unique solution of the equation $$v(i) = \max \left\{g(i), \sum_{j} p_{ij} v(j)\right\}, \quad \forall i \in S,$$

subject to $v(i) = 0$ for $i \in S_r$.

Moreover, the following also holds: $v^*$ is the minimal solution of the system of inequalities $$v(i) \geq \max \left\{g(i), \sum_{j} p_{ij} v(j)\right\}, \quad \forall i \in S.$$
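As a numerical sanity check of the statement (this is a made-up toy chain, not part of the question), one can run value iteration $v \leftarrow \max(g, Pv)$ on a small chain with an absorbing (hence recurrent) state where $g$ vanishes, and verify that the limit satisfies the fixed-point equation:

```python
# Toy example: states 0 and 1 are transient, state 2 is absorbing
# (recurrent), and g = 0 on the recurrent state, as the problem requires.
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.0, 0.0, 1.0],  # absorbing state
]
g = [2.0, 5.0, 0.0]  # reward function, zero on the recurrent state


def value_iteration(P, g, n_iter=200):
    """Iterate v <- max(g, P v), starting from v = g."""
    n = len(g)
    v = list(g)
    for _ in range(n_iter):
        Pv = [sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [max(g[i], Pv[i]) for i in range(n)]
    return v


v = value_iteration(P, g)
# v approximates the fixed point of v(i) = max{g(i), sum_j p_ij v(j)};
# here the limit is v = [3, 5, 0] (solve 0.5*v0 + 1.5 = v0 for state 0).
```

This does not prove uniqueness, of course, but it illustrates the boundary condition $v = 0$ on $S_r$ and the role of the $\max$ in the Wald–Bellman equation.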

My approach: The equality to show reminds me of the Wald–Bellman equation. There are some results connecting the Wald–Bellman equation to harmonic functions, but I'm not sure how to show the uniqueness of the function. Furthermore, those Wald–Bellman results do not address the condition $v(i) = 0$ for $i \in S_r$.