I'm trying to reconstruct the proof of the following theorem from my lecture notes. Could you please verify whether my proof is correct or contains any logical mistakes? Thank you so much!
Let $\left(\Omega, \mathcal{F},(\mathcal{F}_{n}), \mathbb P\right)$ be a filtered probability space. If $(X_{n})$ is a non-positive sub-martingale and $S \le T$ are finite stopping times, then $\mathbb E [ X_T | \mathcal F_S ] \ge X_S$ a.s.
My attempt:
First, we need the following lemmas:
$\textbf{Lemma 1}$ If $(X_{n})$ is a sub-martingale w.r.t. $(\mathcal{F}_{n})$ and $\phi: \mathbb R \to \mathbb R$ is non-decreasing and convex, then $(\phi(X_{n}))$ is also a sub-martingale w.r.t. $(\mathcal{F}_{n})$.
$\textbf{Lemma 2}$ If $(X_{n})$ is a bounded sub-martingale w.r.t. $(\mathcal{F}_{n})$ and $S \le T$ are finite stopping times, then $(X_S, X_T)$ is also a sub-martingale w.r.t. $(\mathcal{F}_{S}, \mathcal{F}_{T})$.
$\textbf{Lemma 3}$ If $\mathcal G$ is a sub $\sigma$-algebra of $\mathcal F$ and $X:\Omega \to \mathbb R$ is integrable, then $$\mathbb E[X | \mathcal G] \ge 0 \quad \text{a.s.} \iff \forall \Lambda \in \mathcal G: \mathbb E [\mathbf{1}_\Lambda X] \ge 0$$
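For reference, here is a sketch of why Lemma 3 holds, using only the defining property of conditional expectation. The symbols $Z := \mathbb E[X | \mathcal G]$ and $\Lambda_0$ are notation introduced for this sketch only.

$$\begin{aligned}
(\Rightarrow)\quad & \mathbb E[\mathbf 1_\Lambda X] = \mathbb E[\mathbf 1_\Lambda Z] \ge 0 \quad \text{for every } \Lambda \in \mathcal G, \\
(\Leftarrow)\quad & \Lambda_0 := \{Z < 0\} \in \mathcal G \implies 0 \le \mathbb E[\mathbf 1_{\Lambda_0} X] = \mathbb E[\mathbf 1_{\Lambda_0} Z] \le 0 \implies \mathbb P(\Lambda_0) = 0.
\end{aligned}$$

The last implication uses that $\mathbf 1_{\Lambda_0} Z \le 0$ has zero expectation, so it vanishes a.s., while $Z < 0$ on $\Lambda_0$.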
For each $m \in \mathbb N$, we define a map $\phi_m: \mathbb R \to \mathbb R, \quad x \mapsto \max \{x, -m\}$. It's easy to verify that $\phi_m$ is non-decreasing and convex, and that $\phi_m (x) \downarrow x$ as $m \to \infty$. By $\textbf{Lemma 1}$, $(\phi_m(X_{n}))_{n \in \mathbb N}$ is a sub-martingale.
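For completeness, here is a sketch of the checks behind "easy to verify". The reals $x \le y$ and the weight $\lambda \in [0,1]$ are introduced only for this verification; in the convexity line, $x$ and $y$ may be arbitrary reals.

$$\begin{aligned}
\text{(non-decreasing)}\quad & \phi_m(x) = \max\{x,-m\} \le \max\{y,-m\} = \phi_m(y), \\
\text{(convex)}\quad & \phi_m(\lambda x + (1-\lambda) y) = \max\{\lambda x + (1-\lambda) y,\ \lambda(-m) + (1-\lambda)(-m)\} \le \lambda \phi_m(x) + (1-\lambda)\phi_m(y), \\
\text{(decreasing in } m\text{)}\quad & \phi_{m+1}(x) = \max\{x,-(m+1)\} \le \max\{x,-m\} = \phi_m(x), \\
\text{(limit)}\quad & \phi_m(x) = x \quad \text{for all } m \ge -x.
\end{aligned}$$

The convexity line uses $\max\{a+b,\ c+d\} \le \max\{a,c\} + \max\{b,d\}$ together with $\max\{\lambda x, -\lambda m\} = \lambda \max\{x,-m\}$ for $\lambda \ge 0$.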
Because $\phi_m$ is bounded from below by $-m$ and $X_n$ is bounded from above by $0$, we have $-m \le \phi_m(X_n) \le 0$, so $(\phi_m(X_{n}))_{n \in \mathbb N}$ is bounded. Then, by $\textbf{Lemma 2}$ applied to the stopping times $0 \le S$, $(\phi_m(X_{0}),\phi_m(X_{S}))$ is a sub-martingale w.r.t. $(\mathcal F_0, \mathcal F_S)$ and thus $-\infty < \mathbb E [X_{0}] \le \mathbb E [\phi_m(X_{0})] \le \mathbb E [\phi_m(X_{S})]$.
By $\textbf{Monotone Convergence Theorem}$, we get $\mathbb E [\phi_m(X_{S})] \to \mathbb E [X_S]$ as $m \to \infty$. Hence $\mathbb E [X_{0}] \le\mathbb E [X_S] \le 0$. Thus $X_S$ is integrable. Similarly, $X_T$ is integrable.
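The use of the Monotone Convergence Theorem here can be spelled out as follows. This is only a sketch, and the auxiliary variable $Y_m$ is notation introduced here, not taken from the lecture notes. Since $S$ is finite and every $X_n \le 0$, we have $X_S \le 0$, so

$$Y_m := -\phi_m(X_S) = \min\{-X_S,\ m\}, \qquad 0 \le Y_m \uparrow -X_S \quad (m \to \infty).$$

Monotone convergence gives $\mathbb E[Y_m] \uparrow \mathbb E[-X_S] \in [0,\infty]$, and the inequality $\mathbb E[X_0] \le \mathbb E[\phi_m(X_S)]$ reads $\mathbb E[Y_m] \le -\mathbb E[X_0]$, so

$$0 \le \mathbb E[-X_S] = \lim_{m \to \infty} \mathbb E[Y_m] \le -\mathbb E[X_0] < \infty.$$

The same computation with $T$ in place of $S$ handles $X_T$.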
By $\textbf{Lemma 2}$ again, $(\phi_m(X_{S}),\phi_m(X_{T}))$ is a sub-martingale w.r.t. $(\mathcal F_S, \mathcal F_T)$ and thus $\mathbb E[\phi_m(X_{T})-\phi_m(X_{S}) | \mathcal F_S] \ge 0$. Since $\phi_m(X_{T})-\phi_m(X_{S})$ is bounded and hence integrable, $\textbf{Lemma 3}$ applies, and this is equivalent to $$\forall \Lambda \in \mathcal F_S: \mathbb E [\mathbf{1}_\Lambda \phi_m(X_{T})] \ge \mathbb E [ \mathbf{1}_\Lambda \phi_m(X_{S}) ]$$
By $\textbf{Monotone Convergence Theorem}$ again, applied to the non-negative increasing sequences $-\mathbf{1}_\Lambda \phi_m(X_{S})$ and $-\mathbf{1}_\Lambda \phi_m(X_{T})$, we get $\mathbb E [\mathbf{1}_\Lambda \phi_m(X_{S})] \to \mathbb E [\mathbf{1}_\Lambda X_S]$ and $\mathbb E [\mathbf{1}_\Lambda \phi_m(X_{T})] \to \mathbb E [\mathbf{1}_\Lambda X_T]$ as $m \to \infty$. As a result, $\forall \Lambda \in \mathcal F_S: \mathbb E [\mathbf{1}_\Lambda X_{T}] \ge \mathbb E [ \mathbf{1}_\Lambda X_{S} ]$. Because $X_T - X_S$ is integrable, $\textbf{Lemma 3}$ applies again and gives $\mathbb E [X_{T} - X_{S} | \mathcal F_S] \ge 0$. Finally, since $X_S$ is $\mathcal F_S$-measurable and integrable, this is exactly $\mathbb E [X_{T} | \mathcal F_S] \ge X_S$ a.s.