Weak convergence of measures is equivalent to convergence in Wasserstein distance


Weak convergence of probability measures on a topological space $X$ (say, a metric space), denoted $\mu_n \xrightarrow{w} \mu$, is defined as $$ \int f \, d\mu_n \rightarrow \int f \, d\mu \;\;\;\; \text{ for all } f\in C_b(X)$$ where $C_b(X)$ is the set of continuous, bounded functions $X \rightarrow \mathbb{R}$.

I read that weak convergence is equivalent to convergence in 2-Wasserstein distance (optimal transport distance); the distance is given by $$W_2(\mu_n, \mu) = \left( \inf_{p\in P(\mu_n,\mu)} \int_{X \times X} c(x,y)^2 \, p(dx \, dy) \right)^{1/2}$$ where $P(\mu_n,\mu)$ is the set of probability measures on $X \times X$ with $\mu_n$ and $\mu$ as marginals, and $c$ is a 'cost' function, $c: X\times X \rightarrow[0,\infty)$ (usually the metric on $X$).
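To make the definition concrete, here is a minimal sketch for one special case only. It assumes $X = \mathbb{R}$, $c(x,y) = |x-y|$, and that both measures are uniform empirical measures with the same number of atoms. On the real line the monotone (sorted) coupling is optimal for this cost, so no optimization over $P(\mu_n,\mu)$ is needed. The function name `w2_empirical_1d` is made up for illustration.

```python
import math


def w2_empirical_1d(xs, ys):
    """W_2 between two uniform empirical measures on the real line.

    Assumes both measures have the same number of atoms, each with
    mass 1/len(xs), and cost c(x, y) = |x - y|. Under these assumptions
    the optimal coupling pairs the i-th smallest atom of one measure
    with the i-th smallest atom of the other.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("need two non-empty samples of equal size")
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    # Integral of c(x, y)^2 against the sorted coupling.
    mean_sq = sum((x - y) ** 2 for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return math.sqrt(mean_sq)


if __name__ == "__main__":
    # Shifting every atom by 1 moves each unit of mass a distance 1.
    print(w2_empirical_1d([0.0, 1.0], [1.0, 2.0]))
```

For general measures or higher dimensions the infimum is a genuine linear program over couplings, and this shortcut does not apply.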

I'm trying to figure out why weak convergence and convergence in $W_2$ distance are equivalent. I guess $f$ and $c$ can be treated the same, but $f$ has to be continuous and bounded whereas $c$ only has to be nonnegative. I don't see how $\int f \, d\mu_n - \int f \, d\mu \rightarrow 0$ for all $f \in C_b(X)$ implies $W_2(\mu_n,\mu) \rightarrow 0$.
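A small numerical experiment suggests the doubt is justified when $X$ is unbounded. The sketch below assumes $X = \mathbb{R}$ with $c(x,y) = |x-y|$ and takes $\mu_n = (1-\tfrac1n)\delta_0 + \tfrac1n\delta_n$, $\mu = \delta_0$; these measures and the helper names are chosen for illustration and are not from the original statement. Because the target is a Dirac mass, the only coupling is the product measure, so $W_2(\mu_n,\delta_0)^2 = \int |x|^2 \, d\mu_n = n$. Meanwhile $\int f \, d\mu_n = (1-\tfrac1n) f(0) + \tfrac1n f(n) \to f(0)$ for every bounded $f$.

```python
import math


def mu(n):
    """Atoms (location, mass) of mu_n = (1 - 1/n) delta_0 + (1/n) delta_n."""
    return [(0.0, 1.0 - 1.0 / n), (float(n), 1.0 / n)]


def integrate(f, atoms):
    """Integral of f against a finitely supported measure."""
    return sum(mass * f(x) for x, mass in atoms)


def w2_to_dirac(atoms, a):
    """W_2 distance to delta_a with cost |x - y|.

    The product measure is the only coupling with a Dirac marginal,
    so the infimum is just the second moment about a.
    """
    return math.sqrt(sum(mass * (x - a) ** 2 for x, mass in atoms))


if __name__ == "__main__":
    f = math.atan  # one continuous bounded test function, f(0) = 0
    for n in (10, 100, 1000):
        # The integrals shrink toward f(0) = 0 while W_2 grows like sqrt(n).
        print(n, integrate(f, mu(n)), w2_to_dirac(mu(n), 0.0))
```

So on an unbounded space weak convergence alone does not force $W_2(\mu_n,\mu) \to 0$; the equivalence one usually sees stated holds on bounded metric spaces, or for weak convergence together with convergence of second moments.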