One-sentence summary: I want an intuitive explanation for why closeness of probability measures (in terms of, e.g., total variation or Lévy-Prokhorov distance) implies the existence of good couplings.
Setup. Consider a separable metric space $(X,d)$ and the space of probability measures $\mathcal{M}(X)$. Equip $\mathcal{M}(X)$ with one of the "usual" metrics $\rho$: for concreteness, in this post let $\rho$ be the Lévy-Prokhorov metric or the total variation distance (TVD) (it is a theorem that $\mathcal{M}(X)$ is separable under either of these metrics).
It is a theorem that, if $\mu_1, \mu_2 \in \mathcal{M}(X)$ satisfy $\rho(\mu_1, \mu_2) <\epsilon$ for one of these choices of $\rho$, then there exists a "good" coupling: i.e., there exists a measure $\mu$ on $X\times X$ such that the random variables $(Y, Z) \sim \mu$ satisfy
- $Y \sim \mu_1$ and $Z \sim \mu_2$, and
- $d(Y,Z) < \epsilon$ with $\mu$-probability $> 1-\epsilon$. (I know there is an even stronger statement when $\rho$ is taken to be TVD, but that's not the point of this post; this version is enough for me.) (Ref: Billingsley, Convergence of Probability Measures, Theorem 6.9)
The Question. Is there an intuitive way to understand this theorem, or more generally the relation between these distances on spaces of probability measures (which seem abstract to me) and good/optimal couplings (which seem concrete to me)? EDIT: maybe some kind of theorem that says closeness of the laws "projects down" to closeness of the random variables?
I have taken a look at the proofs of the above result and of the famous Coupling Lemma (for TVD), but both proofs are rather opaque to me. They feel like a bit of abstract nonsense, but I am hoping that this is only because I do not yet have the right understanding.
Thank you!
Certainly not: two random variables can have the same law and yet be very far apart as random variables.
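A minimal numerical sketch of this point (the Bernoulli example here is my own illustration, not from the question): take $X \sim \mathrm{Bernoulli}(1/2)$ and $Y = 1 - X$. The two laws are identical, yet $|X - Y| = 1$ on every sample.

```python
import random

# X ~ Bernoulli(1/2) and Y = 1 - X have the *same* law,
# but |X - Y| = 1 always: identical laws, maximally distant variables.
samples = [(x, 1 - x) for x in (random.randint(0, 1) for _ in range(10_000))]

mean_x = sum(x for x, _ in samples) / len(samples)
mean_y = sum(y for _, y in samples) / len(samples)
print(f"empirical mean of X: {mean_x:.2f}")  # both near 0.5: same law
print(f"empirical mean of Y: {mean_y:.2f}")
print(all(abs(x - y) == 1 for x, y in samples))  # True: always far apart
```

So closeness of laws cannot control the distance between two *given* random variables; at best it can guarantee the existence of *some* close pair, which is exactly the coupling statement.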
However, it's certainly true that we can go the other way: if two random variables are close in some sense, then their laws are close. In both of the metrics you're considering this is easy to see.
The converse of "if two random variables are close then their laws are close" is false. But that statement does imply the weaker one: "if two distributions admit a coupling by random variables that are close, then the laws are close." And this weaker statement does have a true converse, which is the theorem you quote.
But the crucial part of proving that converse is finding a coupling that makes the random variables close. This requires some explicit construction of a coupling.
I think this construction is the part you find to be abstract nonsense? But while there could be many different sorts of constructions, there isn't going to be an explanation without some kind of construction (or a related abstract argument), since the hard part of the statement is the existence of something.
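To make the construction less abstract, here is a sketch (my own illustration, for finite distributions) of the standard "maximal coupling" behind the TVD Coupling Lemma: with probability $1 - \mathrm{TV}(p,q)$ both coordinates are drawn from the common overlap $\min(p,q)$, where they agree; otherwise they are drawn independently from the normalized residuals, where they necessarily differ.

```python
import random

def maximal_coupling(p: dict, q: dict):
    """Return one sample (Y, Z) with Y ~ p, Z ~ q and P(Y != Z) = TV(p, q)."""
    overlap = {x: min(p.get(x, 0), q.get(x, 0)) for x in set(p) | set(q)}
    mass = sum(overlap.values())  # = 1 - TV(p, q)
    if random.random() < mass:
        # Agreement branch: Y = Z, sampled from the normalized overlap.
        y = random.choices(list(overlap), weights=overlap.values())[0]
        return y, y
    # Disagreement branch: residuals have disjoint supports, so Y != Z.
    rp = {x: p.get(x, 0) - overlap[x] for x in overlap}
    rq = {x: q.get(x, 0) - overlap[x] for x in overlap}
    y = random.choices(list(rp), weights=rp.values())[0]
    z = random.choices(list(rq), weights=rq.values())[0]
    return y, z

p = {0: 0.5, 1: 0.5}
q = {0: 0.4, 1: 0.6}  # TV(p, q) = 0.1
pairs = [maximal_coupling(p, q) for _ in range(100_000)]
print(sum(y != z for y, z in pairs) / len(pairs))  # close to 0.1 = TV(p, q)
```

The marginals come out right because overlap + residual reconstructs each of $p$ and $q$ exactly, and the disagreement probability is exactly the mass left outside the overlap. The continuous/metric-space versions of the theorem are more technical, but the intuition is the same: the metric $\rho$ measures how much common mass the two laws share (up to an $\epsilon$-blurring of sets, in the Lévy-Prokhorov case), and the coupling is built by gluing the laws together along that common mass.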