I need some help understanding the following proof:

I don't really see how the first inequality is derived. I guess the authors are conditioning on whether $X_0^0 = X_0^\rho$ or $X_0^0 \neq X_0^\rho$, but I'm not sure. Also, I don't understand where the bound on the third probability comes from. Could you help me out? Thanks a lot!
It is not conditioning, just a decomposition of the event whose probability is to be bounded. Let $A=\{\Theta_\delta\le \delta^{-k+\varepsilon}\}$ and $B=\{X_t^\rho\ge k\text{ for some } t\in\{0,\delta^{\varepsilon/2},\dots,\lceil\delta^{-k+\varepsilon/2}\rceil\delta^{\varepsilon/2}\}\}$. Then \begin{align} A&=(A\cap\{X_0^0\ne X_0^\rho\})\cup(A\cap\{X_0^0=X_0^\rho\})\\ &\subset \{X_0^0\ne X_0^\rho\} \cup (A\cap\{X_0^0=X_0^\rho\})\\ &=\{X_0^0\ne X_0^\rho\} \cup (A\cap\{X_0^0=X_0^\rho\}\cap B) \cup (A\cap\{X_0^0=X_0^\rho\}\cap B^c)\\ &\subset \{X_0^0\ne X_0^\rho\} \cup B \cup (A\cap B^c). \end{align}

By the union bound, $P(A)\le P(X_0^0\ne X_0^\rho) + P(B) + P(A\cap B^c)$, which is the first inequality you asked about.

However, I think the authors intended to use the bound $$P(A)\le P(X_0^0\ne X_0^\rho) + P(B) + P(A\cap \{X_0^0=X_0^\rho\}\cap B^c)$$ (note that this follows from the same calculation). Here, the third term can be estimated using the coupling: observe that $$A\cap \{X_0^0=X_0^\rho\}\cap B^c \subset \{\Theta_\delta^\rho\le \delta^{-k+\varepsilon}\}\cap B^c,$$ where $\Theta_\delta^\rho=\inf\{t\ge 0:X_t^\rho=k\}$. One can then use the definition of $B^c$ and the comment about jump rates to obtain a suitable upper bound on $P(\{\Theta_\delta^\rho\le \delta^{-k+\varepsilon}\}\cap B^c)$.
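
To spell out why the refined bound follows from the same calculation: one keeps the event $\{X_0^0=X_0^\rho\}$ in the last step instead of discarding it. A sketch, using only the sets $A$ and $B$ defined above: \begin{align} A&=(A\cap\{X_0^0\ne X_0^\rho\})\cup(A\cap\{X_0^0=X_0^\rho\}\cap B)\cup(A\cap\{X_0^0=X_0^\rho\}\cap B^c)\\ &\subset \{X_0^0\ne X_0^\rho\}\cup B\cup(A\cap\{X_0^0=X_0^\rho\}\cap B^c). \end{align} Subadditivity of $P$ applied to the three sets on the right-hand side then gives the refined bound. Since $A\cap\{X_0^0=X_0^\rho\}\cap B^c\subset A\cap B^c$, this bound is at least as sharp as the first inequality.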