Characterisation of almost sure convergence vs. convergence in probability


In a lecture on mathematical statistics we defined convergence in probability and almost sure convergence. The definitions are stated for general metric spaces, so let me give them here with a general metric $d: S \times S \to \mathbb{R}_{+}$ defining the distance; I actually find them easier to read and understand that way. Let $(X_n)_{n \in \mathbb{N}}$, $X$ be random variables taking values in $S$.

  1. Almost sure (a.s.) convergence:

$$P(\underset{n \to \infty}{\lim} d(X_n, X) = 0) =1$$

  2. Convergence in probability:

$$\forall \varepsilon > 0: \underset{n \to \infty}{\lim} P(d(X_n,X) > \varepsilon) = 0$$

We also had a criterion for a.s. convergence:

Theorem

$X_n \overset{n \to \infty}{\longrightarrow} X$ a.s. $\iff$ $\forall \varepsilon \in \mathbb{Q}_{>0}: \underset{m \to \infty}{\lim} P(\underset{n \geq m}{\sup} d(X_n,X) > \varepsilon) = 0$ (*)
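(As I understand it, the criterion yields convergence in probability simply because a single tail term is bounded by the supremum over the tail:

$$P(d(X_m, X) > \varepsilon) \leq P(\underset{n \geq m}{\sup} d(X_n, X) > \varepsilon) \overset{m \to \infty}{\longrightarrow} 0.$$

The reverse implication would need the right-hand side to vanish, which the left-hand side alone does not force.)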

My difficulty is this: I know the theorem is used later to prove that a.s. convergence implies convergence in probability. But looking at the form of the right-hand side of the theorem, I don't understand why the reverse implication does not hold. I know there are counterexamples showing the reverse is false, but they don't help me on an intuitive level.

Suppose convergence in probability holds, so $P(d(X_n,X) > \varepsilon) \to 0$ for every $\varepsilon > 0$. Now I see why for a fixed $m$ we could get that

$$P(\underset{n \geq m}{\sup} d(X_n,X) > \varepsilon) > 0,$$

but as $m$ goes to infinity, should I not reach a point where not even the supremum of the differences leaves the $\varepsilon$-corridor? What am I misunderstanding?
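For concreteness, here is a small Python simulation (my own sketch, not from the lecture) of the standard "typewriter" counterexample on $[0,1]$ with Lebesgue measure and $X = 0$: the intervals get shorter, so $P(X_n > \varepsilon) \to 0$, yet every tail $\{n \geq m\}$ still contains a full block of intervals covering all of $[0,1]$, so the supremum over the tail is $1$ for every $\omega$.

```python
import random

def typewriter(n, omega):
    """n-th typewriter indicator: write n = 2^k + j with 0 <= j < 2^k;
    X_n is the indicator of the dyadic interval [j/2^k, (j+1)/2^k)."""
    k = n.bit_length() - 1          # block index
    j = n - (1 << k)                # position within block k
    return 1.0 if j / (1 << k) <= omega < (j + 1) / (1 << k) else 0.0

random.seed(0)
N = 100_000
eps = 0.5

# Convergence in probability to X = 0: P(X_n > eps) = 2^{-k} -> 0.
for n in [2, 4, 16, 256]:
    p = sum(typewriter(n, random.random()) > eps for _ in range(N)) / N
    print(f"P(X_{n} > {eps}) ~ {p:.4f}")

# But no a.s. convergence: for every omega and every m, some n >= m
# has X_n(omega) = 1, because any block k with 2^k >= m lies entirely
# in the tail and its intervals cover all of [0,1].
omega = random.random()
m = 1000
sup_tail = max(typewriter(n, omega) for n in range(m, 4 * m))
print(f"sup over n in [{m}, {4 * m}) of X_n(omega) = {sup_tail}")
```

This is exactly the gap in the intuition above: each individual $P(d(X_n, X) > \varepsilon)$ shrinks, but the tail event in the theorem is a union over infinitely many such events, and that union can keep full probability forever.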