I was looking for a proof that convergence in probability implies convergence in distribution. I found one online and one in Grimmett's Probability and Random Processes; both proofs use the following lemma:
\begin{align}
\operatorname{Pr}(Y\leq a) &= \operatorname{Pr}(Y\leq a,\ X\leq a+\varepsilon) + \operatorname{Pr}(Y\leq a,\ X>a+\varepsilon) \\
&\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(Y-X\leq a-X,\ a-X<-\varepsilon) \\
&\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(Y-X<-\varepsilon) \\
&\leq \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(Y-X<-\varepsilon) + \operatorname{Pr}(Y-X>\varepsilon)\\
&= \operatorname{Pr}(X\leq a+\varepsilon) + \operatorname{Pr}(|Y-X|>\varepsilon)
\end{align}
However, I don't understand the step from the second line to the third. Could someone please explain why $\operatorname{Pr}(Y-X\leq a-X,\ a-X<-\varepsilon) \leq \operatorname{Pr}(Y-X<-\varepsilon)$? It seems simple, but I don't get it.
If both $Y - X \leq a - X$ and $a- X < - \varepsilon$, then $Y - X \leq a - X < - \varepsilon$, so
$$\{ Y - X \leq a - X \} \cap \{ a - X < - \varepsilon \} \subset \{ Y - X < -\varepsilon \}.$$
Then just use the fact that $P(A) \leq P(B)$ whenever $A \subset B$ to get
$$P(\{ Y - X \leq a - X \} \cap \{ a - X < - \varepsilon \}) = P(Y - X \leq a - X,\ a - X < -\varepsilon) \leq P( Y - X < -\varepsilon).$$
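If it helps to see the inclusion concretely, here is a small numerical sanity check, offered as a sketch and not as part of the proof. The function name `check_inclusion`, the Gaussian choice for $X$ and $Y$, and the values of `a` and `eps` are all assumptions made for illustration; the inclusion itself holds for any random variables. The code counts how often each event occurs and how often the first occurs without the second, which should never happen.

```python
import random


def check_inclusion(n=100_000, a=0.3, eps=0.1, seed=0):
    """Count occurrences of A = {Y-X <= a-X, a-X < -eps} and B = {Y-X < -eps}.

    The distributions below are illustrative assumptions only.
    Returns (count_a, count_b, violations), where violations counts
    samples that lie in A but not in B.
    """
    rng = random.Random(seed)
    count_a = count_b = violations = 0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)        # hypothetical X
        y = x + rng.gauss(0.0, 0.5)    # hypothetical Y, close to X
        in_a = (y - x <= a - x) and (a - x < -eps)
        in_b = (y - x < -eps)
        count_a += in_a
        count_b += in_b
        violations += in_a and not in_b
    return count_a, count_b, violations


if __name__ == "__main__":
    count_a, count_b, violations = check_inclusion()
    print("samples in A:", count_a)
    print("samples in B:", count_b)
    print("samples in A but not in B:", violations)
```

Since every sample in $A$ is also in $B$, the empirical frequency of $A$ cannot exceed that of $B$, which mirrors $P(A) \leq P(B)$.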