Wasserstein Metric Inequality


This is the exercise:

This exercise shows that “spreading out” probability measures makes them closer together. Define the convolution of a measure by: for any probability density function $\phi$, let $\phi*\mu$ be the measure such that, for any Borel set $A$

$$ (\phi*\mu)[A]=\int\int\mathbb{I}_A(x)\phi(x-z)d\mu(z)dx. $$

  • i) To gain intuition, take $\delta_0$ as a measure (very not spread out!), and compute $\phi*\delta_0$ for any probability density function $\phi$. Intuitively, is this “more spread out” than $\delta_0$?

  • ii) Fix any compactly supported probability density function $\phi$. Show that for all $\mu,\nu\in\mathcal{P}_p\left(\mathbb{R}^d\right)$,

$$ \mathcal{W}_p(\phi*\mu,\phi*\nu)\le \mathcal{W}_p(\mu,\nu) $$

This is what I have done so far:

i)

Let $\phi$ be a probability density function and let $\delta_0$ be the Dirac measure at $0$. By definition, for any $A\in\mathcal{B}\left(\mathbb{R}^d\right)$,

$$ \begin{aligned} (\phi*\delta_0)[A]&=\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}\mathbb{I}_A(x)\phi(x-z)d\delta_0(z)dx\\ &=\int_{\mathbb{R}^d}\mathbb{I}_A(x)\left(\int_{\mathbb{R}^d}\phi(x-z)d\delta_0(z)\right)dx. \end{aligned} $$

For any fixed $x\in\mathbb{R}^d$ define $\Phi(z):=\phi(x-z)$. Since $\delta_0$ is the unit point mass at $0$, integrating against it simply evaluates the integrand at $0$:

$$ \int_{\mathbb{R}^d}\Phi(z)d\delta_0(z)=\Phi(0)=\phi(x). $$

(Note that $\delta_0$ has no density with respect to Lebesgue measure, so it should not be written as $\delta_0(z)dz$.) Therefore

$$ \begin{aligned} (\phi*\delta_0)[A]&=\int_{\mathbb{R}^d}\mathbb{I}_A(x)\left(\int_{\mathbb{R}^d}\Phi(z)d\delta_0(z)\right)dx\\ &=\int_{\mathbb{R}^d}\mathbb{I}_A(x)\phi(x)dx=\int_A\phi(x)dx. \end{aligned} $$

Now, notice that for any $A\in\mathcal{B}\left(\mathbb{R}^d\right)$, the measure of set $A$ with respect to $\delta_0$ is given by

$$ \delta_0[A]=\int_Ad\delta_0(x)=\begin{cases}1&\text{ if }0\in A\\0&\text{ if }0\notin A\end{cases}, $$

compare this with the measure of $A$ with respect to $\phi*\delta_0$,

$$ (\phi*\delta_0)[A]=\int_A\phi(x)dx\in[0,1]. $$

Clearly, $\delta_0$ is the unit mass concentrated at $0$. On the other hand, $\phi*\delta_0$ is absolutely continuous with respect to Lebesgue measure, with density $\phi$, so it assigns zero mass to every single point (indeed to every Lebesgue-null set) and cannot be concentrated at any $x_0\in\mathbb{R}^d$. In this sense $\phi*\delta_0$ is "more spread out" than $\delta_0$.
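As a numerical sanity check of part i), here is a minimal sketch. It relies on the standard fact that $\phi*\mu$ is the law of $X+Z$ for independent $X\sim\mu$ and $Z\sim\phi$. The choice of $\phi$ as the uniform density on $[-1,1]$, the set $A=[0,0.5]$, and all function names are assumptions made only for illustration.

```python
import random


def sample_convolution(sample_mu, sample_phi, n, rng):
    """Draw n samples from phi*mu as X + Z with X ~ mu and Z ~ phi independent."""
    return [sample_mu(rng) + sample_phi(rng) for _ in range(n)]


def sample_dirac0(rng):
    # X ~ delta_0 is the constant 0.
    return 0.0


def sample_uniform_phi(rng):
    # Hypothetical choice of phi: uniform density on [-1, 1] (compactly supported).
    return rng.uniform(-1.0, 1.0)


rng = random.Random(0)
samples = sample_convolution(sample_dirac0, sample_uniform_phi, 10_000, rng)

# Monte Carlo estimate of (phi*delta_0)[A] for A = [0, 0.5].
# The exact value is the integral of phi over A, i.e. 0.5 * 0.5 = 0.25,
# whereas delta_0[A] = 1 because 0 belongs to A.
frac_in_A = sum(0.0 <= s <= 0.5 for s in samples) / len(samples)
print(frac_in_A)
```

The estimate should land near $0.25$, while $\delta_0$ puts all of its mass on the single point $0\in A$.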

ii)

Let $\phi$ be a compactly supported probability density function and $\mu,\nu\in\mathcal{P}_p\left(\mathbb{R}^d\right)$, then by definition we have that

$$ \begin{aligned} \mathcal{W}_p(\phi*\mu,\phi*\nu)&=\left(\inf\left\{\int_{\mathbb{R}^d\times\mathbb{R}^d}\|x-y\|^pd\pi'(x,y):\pi'\in\Pi(\phi*\mu,\phi*\nu)\right\}\right)^{1/p},\\ \mathcal{W}_p(\mu,\nu)&=\left(\inf\left\{\int_{\mathbb{R}^d\times\mathbb{R}^d}\|x-y\|^pd\pi(x,y):\pi\in\Pi(\mu,\nu)\right\}\right)^{1/p}. \end{aligned} $$

And for $\rho(x,y)=h(x-y)=\|x-y\|^p$ we have that

$$ \begin{aligned} \mathcal{W}_p(\phi*\mu,\phi*\nu)&=\left(C^k_{\rho}(\phi*\mu,\phi*\nu)\right)^{1/p},\\ \mathcal{W}_p(\mu,\nu)&=\left(C^k_{\rho}(\mu,\nu)\right)^{1/p}. \end{aligned} $$

Since $\phi$ is compactly supported, there exists a compact set $C\subset\mathbb{R}^d$ such that

$$ \int_{C}\phi(y)dy=1,\text{ }\phi(x)=0\text{ for all }x\in C^c,\text{ }\phi(x)\ge 0\text{ for all }x\in\mathbb{R}^d $$

(compact support alone does not make $\phi$ bounded, but boundedness is not needed here),

and by definition for any Borel set $A\subset\mathbb{R}^d$

$$ \begin{aligned} (\phi*\mu)[A]&=\int_{\mathbb{R}^d}\mathbb{I}_A(x)\int_{\mathbb{R}^d}\phi(x-z)d\mu(z)dx\\&=\int_A\int_{\mathbb{R}^d}\phi(x-z)d\mu(z)dx\\ &=\int_A\int_{\left\{z\in\mathbb{R}^d:x-z\in C\right\}}\phi(x-z)d\mu(z)dx\\ &=\int_A\int_{-C+x}\phi(-z+x)d\mu(z)dx, \end{aligned} $$

such that for a fixed $x\in\mathbb{R}^d$ the set $-C+x$ is given by

$$ -C+x=\{z\in\mathbb{R}^d:z=-c+x\text{ for some }c\in C\} $$

Any ideas on what I can do to prove that

$$ \mathcal{W}_p(\phi*\mu,\phi*\nu)\le \mathcal{W}_p(\mu,\nu) $$

?


There are 2 best solutions below


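Recall that

$\mathcal{W}_p(\mu,\nu) = \left(\inf_\pi \int \|x-y\|^p d\pi(x,y)\right)^{1/p},$

where the infimum is over all couplings $\pi$ of $\mu$ and $\nu$.

The idea is to turn a coupling of $\mu$ and $\nu$ into a coupling of $\phi*\mu$ and $\phi*\nu$ (not the other way around) by adding the same noise to both coordinates. Let $\pi$ be any coupling of $\mu$ and $\nu$, take $(X,Y)\sim\pi$, and let $Z$ be a random variable with density $\phi$ that is independent of $(X,Y)$. Let $\pi'$ be the law of $(X+Z,Y+Z)$.

Then $\pi'$ is a coupling of $\phi*\mu$ and $\phi*\nu$. Indeed, for any Borel set $A$, Fubini's theorem and the substitution $x=z+w$ give

$\mathbb{P}(X+Z\in A)=\int\int\mathbb{I}_A(z+w)\phi(w)dw\,d\mu(z)=\int\int\mathbb{I}_A(x)\phi(x-z)d\mu(z)dx=(\phi*\mu)[A],$

and likewise $\mathbb{P}(Y+Z\in A)=(\phi*\nu)[A]$. Compact support of $\phi$ is only used to guarantee that $\phi*\mu$ and $\phi*\nu$ again have finite $p$-th moments.

Since the common shift cancels, $\|(X+Z)-(Y+Z)\|=\|X-Y\|$, so

$\int \|x-y\|^p d\pi'(x,y) = \mathbb{E}\|X-Y\|^p = \int \|x-y\|^p d\pi(x,y).$

Therefore, for every coupling $\pi$ of $\mu$ and $\nu$,

$\mathcal{W}_p(\phi*\mu,\phi*\nu)^p \leq \int \|x-y\|^p d\pi(x,y).$

Taking the infimum over all $\pi$ and then the $p$-th root gives

$\mathcal{W}_p(\phi*\mu,\phi*\nu) \leq \mathcal{W}_p(\mu,\nu),$

which proves the inequality.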
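One way to see the inequality at work is to couple the smoothed measures by adding the same noise draw to both sides of an optimal pairing, which leaves every pairwise distance unchanged. The sketch below does this for empirical measures on $\mathbb{R}$, where the optimal pairing for $p\ge 1$ matches sorted samples. The Gaussian choices for $\mu$ and $\nu$, the uniform density on $[-1,1]$ for $\phi$, and the helper name `wasserstein_1d` are assumptions for illustration only.

```python
import random


def wasserstein_1d(xs, ys, p=2):
    """W_p between two empirical measures on R with equally many atoms.

    In one dimension, for p >= 1, matching sorted samples is optimal.
    """
    assert len(xs) == len(ys)
    total = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return (total / len(xs)) ** (1.0 / p)


rng = random.Random(0)
n, p = 2000, 2

# Atoms of two empirical measures, sorted so that index i pairs optimally.
xs = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
ys = sorted(rng.gauss(3.0, 0.5) for _ in range(n))

# One noise draw per pair, from a compactly supported density (assumed uniform).
zs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
xs_smooth = [x + z for x, z in zip(xs, zs)]
ys_smooth = [y + z for y, z in zip(ys, zs)]

w_orig = wasserstein_1d(xs, ys, p)

# Cost of the "same noise" coupling: it pairs x_i + z_i with y_i + z_i.
shared_noise_cost = (
    sum(abs(a - b) ** p for a, b in zip(xs_smooth, ys_smooth)) / n
) ** (1.0 / p)

# The optimal cost between the shifted samples can only be smaller.
w_smooth = wasserstein_1d(xs_smooth, ys_smooth, p)

print(w_orig, shared_noise_cost, w_smooth)
```

Here `shared_noise_cost` agrees with `w_orig` up to rounding, and `w_smooth` is never larger, which is the finite-sample analogue of the coupling argument.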


You can find a beautiful proof of this inequality in Lemma 5.2 (p. 180) of the monograph Optimal Transport for Applied Mathematicians by Filippo Santambrogio.