I am having trouble understanding the proof of Lemma 1 in the paper "Optimal Stopping for Levy Processes with one-sided Solutions" by Ernesto Mordecki and Yuliya Mishura:
Consider a Lévy process $X$ with supremum $M=\sup\{X_t:t\geq 0\}$, which is assumed to be a proper random variable, a discount rate $r\geq 0$, an integrable function $g:\mathbb{R}\to[0,\infty)$ with $\lim_{x\to -\infty}g(x)=0$, a point $x^*\in\mathbb{R}$, and a measurable, nondecreasing function $G^*:[x^*,\infty)\to[0,\infty)$ with $E_x(G^*(M))=g(x)$ for all $x\geq x^*$. The function $G$ is defined by $$ G(x)= \begin{cases} G^*(x), & \text{ if }x\geq x^*\\ 0, & \text{ if }x<x^* \end{cases} $$ and for every $a\geq x^*$ we have the stopping time $\tau_a=\inf\{t:X_t\geq a\}$.
Then, for any $a\geq x^*$ and $x\in\mathbb{R}$, $$ E_x(G(M)1_{\{M\geq a\}})=E_x(e^{-r\tau_a}g(X_{\tau_a})1_{\{\tau_a<\infty\}}). $$

Proof.
Taking conditional expectation w.r.t. the $\sigma$-algebra $\mathcal{F}_{\tau_a}$ and applying the iterated law of conditional expectation, we have
$$ E_x(G(M)1_{\{M\geq a\}}) \stackrel{(1)}{=}E_x(G(\sup_{0\leq t<\infty}X_t)1_{\{\tau_a<\infty\}}) $$
$$ \stackrel{(2)}{=}E_x(G(X_{\tau_a}+\sup_{\tau_a\leq t<\infty}(X_t-X_{\tau_a}))1_{\{\tau_a<\infty\}}) $$
$$ \stackrel{(3)}{=}E_x(E_{X_{\tau_a}}(G(X_{\tau_a}+\sup_{0\leq s<\infty}X_s)1_{\{\tau_a<\infty\}})) $$
$$ \stackrel{(4)}{=}E_x(g(X_{\tau_a})1_{\{\tau_a<\infty\}}) $$

I don't understand steps (3) and (4). In step (3) it seems that the Markov property is being used, but I couldn't find any theorem that would justify this step. In step (4) it seems to me that there is one $X_{\tau_a}$ too many. I hope someone can explain at least step (3).
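The key ingredient is the strong Markov property of Lévy processes: if $\tau$ is a stopping time, then on $\{\tau<\infty\}$ the process $(X_{\tau+t}-X_{\tau})_{t \geq 0}$ is again a Lévy process, it is independent of $\mathcal{F}_{\tau}$, and it has the same distribution as $(X_t-X_0)_{t \geq 0}$.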
Back to your framework: If we define $Y_t := X_{\tau_a+t}-X_{\tau_a}$, then
$$\begin{align*} \mathbb{E}_x \bigg[ G \bigg( X_{\tau_a} + \sup_{\tau_a \leq t < \infty} (X_t-X_{\tau_a}) \bigg) 1_{\{\tau_a<\infty\}} \bigg] &= \mathbb{E}_x \big[ G(X_{\tau_a}+\sup_{t \geq 0} Y_t) 1_{\{\tau_a<\infty\}} \big] \\ &= \mathbb{E} \big[ G(x+X_{\tau_a}+\sup_{t \geq 0} Y_t) 1_{\{\tau_a<\infty\}} \big]. \end{align*}$$
Now we use the tower property and condition on $\mathcal{F}_{\tau_a}$, i.e. we use that
$$\mathbb{E}(\dots) = \mathbb{E} \bigg( \mathbb{E}( \dots \mid \mathcal{F}_{\tau_a}) \bigg).$$
This gives
$$\begin{align*} \mathbb{E}_x \bigg[ G \bigg( X_{\tau_a} + \sup_{\tau_a \leq t < \infty} (X_t-X_{\tau_a}) \bigg) 1_{\{\tau_a<\infty\}} \bigg] &= \mathbb{E} \bigg[ 1_{\{\tau_a<\infty\}} \mathbb{E} \big( G(x+X_{\tau_a}+\sup_{t \geq 0} Y_t) \mid \mathcal{F}_{\tau_a} \big) \bigg] \\ &= \mathbb{E} \bigg[ 1_{\{\tau_a<\infty\}} \mathbb{E} \big( G(y+\sup_{t \geq 0} Y_t) \big) \big|_{y=x+X_{\tau_a}} \bigg] \\ &= \mathbb{E}_x \bigg[ 1_{\{\tau_a<\infty\}} \mathbb{E} \big( G(y+\sup_{t \geq 0} Y_t) \big) \big|_{y=X_{\tau_a}} \bigg] \end{align*}$$
where we have used in the penultimate step that $(Y_t)_{t \geq 0}$ and $\mathcal{F}_{\tau_a}$ are independent and that $X_{\tau_a}$ is $\mathcal{F}_{\tau_a}$-measurable. Since Lévy processes are space homogeneous, we have
$$\mathbb{E} \big( G(y+ \sup_{t \geq 0} Y_t) \big) = \mathbb{E}_y(G(\sup_{t \geq 0} Y_t)) = g(y)$$
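for all $y \geq x^*$; the last "=" follows from the fact that $(Y_t)_{t \geq 0}$ has the same distribution as $(X_t)_{t \geq 0}$ under $\mathbb{P}=\mathbb{P}_0$, together with $G=G^*$ on $[x^*,\infty)$. On $\{\tau_a<\infty\}$ we have $X_{\tau_a} \geq a \geq x^*$ by right-continuity of the paths, so this applies with $y=X_{\tau_a}$. Consequently,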
$$\mathbb{E}_x \bigg[ G \bigg( X_{\tau_a} + \sup_{\tau_a \leq t < \infty} (X_t-X_{\tau_a}) \bigg) 1_{\{\tau_a<\infty\}} \bigg] = \mathbb{E}_x(g(X_{\tau_a}) 1_{\{\tau_a<\infty\}}).$$
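As a sanity check, the identity can be verified by hand in a simple special case. The following is only an illustration under assumptions that are not part of the question: take $r=0$ and let $X_t = x + B_t - \mu t$ under $\mathbb{P}_x$, where $B$ is a standard Brownian motion and $\mu>0$. In this case $M-x$ is exponentially distributed with parameter $2\mu$ under $\mathbb{P}_x$, and since the paths are continuous, $X_{\tau_a}=a$ on $\{\tau_a<\infty\}=\{M \geq a\}$ whenever $x \leq a$. For $a \geq x^*$ and $x \leq a$ we then get

$$\begin{align*} \mathbb{E}_x \big( G(M) 1_{\{M \geq a\}} \big) &= \int_a^{\infty} G(m) \, 2\mu e^{-2\mu(m-x)} \, dm \\ &= e^{-2\mu(a-x)} \int_a^{\infty} G(m) \, 2\mu e^{-2\mu(m-a)} \, dm \\ &= e^{-2\mu(a-x)} \, \mathbb{E}_a(G(M)) \\ &= \mathbb{P}_x(\tau_a<\infty) \, g(a) \\ &= \mathbb{E}_x \big( g(X_{\tau_a}) 1_{\{\tau_a<\infty\}} \big). \end{align*}$$

The third line uses that $M \geq a$ almost surely under $\mathbb{P}_a$, and the fourth uses $G=G^*$ on $[a,\infty)$ together with $\mathbb{E}_a(G^*(M))=g(a)$. The factorisation in the second line is the memorylessness of the exponential distribution, which is exactly what the strong Markov property provides in the general argument.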