This is the exercise:
This exercise shows that “spreading out” probability measures makes them closer together. Define the convolution of a measure by: for any probability density function $\phi$, let $\phi*\mu$ be the measure such that, for any Borel set $A$
$$ (\phi*\mu)[A]=\int\int\mathbb{I}_A(x)\phi(x-z)d\mu(z)dx. $$
i) To gain intuition, take $\delta_0$ as a measure (very not spread out!), and compute $\phi*\delta_0$ for any probability density function $\phi$. Intuitively, is this “more spread out” than $\delta_0$?
ii) Fix any compactly supported probability density function $\phi$. Show that for all $\mu,\nu\in\mathcal{P}_p\left(\mathbb{R}^d\right)$,
$$ \mathcal{W}_p(\phi*\mu,\phi*\nu)\le \mathcal{W}_p(\mu,\nu) $$
This is what I have done so far:
i)
Let $\phi$ be a probability density function and let $\delta_0$ be the Dirac measure at $0$. By definition, for any $A\in\mathcal{B}\left(\mathbb{R}^d\right)$
$$ \begin{aligned} (\phi*\delta_0)[A]&=\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}\mathbb{I}_A(x)\phi(x-z)d\delta_0(z)dx\\ &=\int_{\mathbb{R}^d}\mathbb{I}_A(x)\left(\int_{\mathbb{R}^d}\phi(x-z)d\delta_0(z)\right)dx. \end{aligned} $$
For any fixed $x\in\mathbb{R}^d$ let's define $\Phi(z):=\phi(x-z)$. Integrating a function against $\delta_0$ simply evaluates it at $0$, that is $\int_{\mathbb{R}^d}f(z)d\delta_0(z)=f(0)$, so that
$$ \begin{aligned} (\phi*\delta_0)[A]&=\int_{\mathbb{R}^d}\mathbb{I}_A(x)\left(\int_{\mathbb{R}^d}\phi(x-z)d\delta_0(z)\right)dx\\ &=\int_{\mathbb{R}^d}\mathbb{I}_A(x)\left(\int_{\mathbb{R}^d}\Phi(z)d\delta_0(z)\right)dx\\ &=\int_A\Phi(0)dx=\int_A\phi(x)dx. \end{aligned} $$
Therefore $\phi*\delta_0$ is the probability measure with density $\phi$ with respect to Lebesgue measure.
Now, notice that for any $A\in\mathcal{B}\left(\mathbb{R}^d\right)$, the measure of the set $A$ with respect to $\delta_0$ is given by
$$ \delta_0[A]=\int_Ad\delta_0(x)=\begin{cases}1&\text{ if }0\in A\\0&\text{ if }0\notin A\end{cases}. $$
Compare that to the measure of the set $A$ with respect to $\phi*\delta_0$:
$$ (\phi*\delta_0)[A]=\int_A\phi(x)dx\in[0,1]. $$
Clearly, $\delta_0$ is the unit mass concentrated at $0$. On the other hand, $\phi*\delta_0$ has density $\phi$ with respect to Lebesgue measure, so it assigns mass zero to every single point (indeed to every Lebesgue-null set) and cannot be concentrated at any $x_0\in\mathbb{R}^d$. In this sense $\phi*\delta_0$ is "more spread out" than $\delta_0$.
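As a concrete illustration (the choice of $\phi$ here is my own assumption, not part of the exercise), take $d=1$ and the uniform density $\phi=\mathbb{I}_{[-1/2,1/2]}$. Writing $\lambda$ for Lebesgue measure, the formula $(\phi*\delta_0)[A]=\int_A\phi(x)dx$ becomes
$$ (\phi*\delta_0)[A]=\int_A\mathbb{I}_{[-1/2,1/2]}(x)dx=\lambda\left(A\cap[-1/2,1/2]\right), $$
so that, for example,
$$ (\phi*\delta_0)[\{0\}]=0\ne 1=\delta_0[\{0\}],\qquad (\phi*\delta_0)\left[[0,1/4]\right]=\tfrac{1}{4}\ne 1=\delta_0\left[[0,1/4]\right]. $$
In this example the unit mass at $0$ has been smeared uniformly over $[-1/2,1/2]$.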
ii)
Let $\phi$ be a compactly supported probability density function and $\mu,\nu\in\mathcal{P}_p\left(\mathbb{R}^d\right)$, then by definition we have that
$$ \begin{aligned} \mathcal{W}_p(\phi*\mu,\phi*\nu)&=\left(\inf\left\{\int_{\mathbb{R}^d\times\mathbb{R}^d}\|x-y\|^pd\pi'(x,y):\pi'\in\Pi(\phi*\mu,\phi*\nu)\right\}\right)^{1/p},\\ \mathcal{W}_p(\mu,\nu)&=\left(\inf\left\{\int_{\mathbb{R}^d\times\mathbb{R}^d}\|x-y\|^pd\pi(x,y):\pi\in\Pi(\mu,\nu)\right\}\right)^{1/p}. \end{aligned} $$
And for $\rho(x,y)=h(x-y)=\|x-y\|^p$, writing $C^k_{\rho}$ for the optimal Kantorovich cost with cost function $\rho$, we have that
$$ \begin{aligned} \mathcal{W}_p(\phi*\mu,\phi*\nu)^p&=C^k_{\rho}(\phi*\mu,\phi*\nu),\\ \mathcal{W}_p(\mu,\nu)^p&=C^k_{\rho}(\mu,\nu). \end{aligned} $$
Since $\phi$ is compactly supported, there exists a compact set $C\subset\mathbb{R}^d$ such that
$$ \int_{C}\phi(y)dy=1,\quad\phi(x)=0\text{ for all }x\in C^c,\quad\phi(x)\ge 0\text{ for all }x\in\mathbb{R}^d, $$
and by definition for any Borel set $A\subset\mathbb{R}^d$
$$ \begin{aligned} (\phi*\mu)[A]&=\int_{\mathbb{R}^d}\mathbb{I}_A(x)\int_{\mathbb{R}^d}\phi(x-z)d\mu(z)dx\\&=\int_A\int_{\mathbb{R}^d}\phi(x-z)d\mu(z)dx\\ &=\int_A\int_{\left\{z\in\mathbb{R}^d:x-z\in C\right\}}\phi(x-z)d\mu(z)dx\\ &=\int_A\int_{-C+x}\phi(-z+x)d\mu(z)dx, \end{aligned} $$
such that for a fixed $x\in\mathbb{R}^d$ the set $-C+x$ is given by
$$ -C+x=\{z\in\mathbb{R}^d:z=-c+x\text{ for some }c\in C\} $$
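One place where the compact support of $\phi$ can be used directly is to check that $\phi*\mu\in\mathcal{P}_p\left(\mathbb{R}^d\right)$, so that the left-hand side of the inequality is well defined. A sketch, assuming $p\ge 1$ and $C\subset\{w\in\mathbb{R}^d:\|w\|\le R\}$ for some radius $R<\infty$ (the symbol $R$ is introduced here only for illustration): substituting $x=z+w$ and using Tonelli's theorem,
$$ \begin{aligned} \int_{\mathbb{R}^d}\|x\|^pd(\phi*\mu)(x)&=\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}\|x\|^p\phi(x-z)dx\,d\mu(z)=\int_{\mathbb{R}^d}\int_{C}\|z+w\|^p\phi(w)dw\,d\mu(z)\\ &\le 2^{p-1}\left(\int_{\mathbb{R}^d}\|z\|^pd\mu(z)+R^p\right)<\infty, \end{aligned} $$
where the bound uses $\|z+w\|^p\le 2^{p-1}\left(\|z\|^p+\|w\|^p\right)$ and $\int_C\phi(w)dw=1$. The same substitution with integrand $1$ gives $(\phi*\mu)[\mathbb{R}^d]=1$, so $\phi*\mu$ is indeed a probability measure.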
Any ideas on what I can do to prove the following inequality?
$$ \mathcal{W}_p(\phi*\mu,\phi*\nu)\le \mathcal{W}_p(\mu,\nu) $$
Recall that
$\mathcal{W}_p(\mu,\nu) = \left(\inf_\pi \int c(x,y)^p d\pi(x,y)\right)^{1/p}$
where $c(x,y)=\|x-y\|$ is the ground cost and the infimum is over all couplings $\pi$ of $\mu$ and $\nu$.
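Now let $\pi$ be any coupling of $\mu$ and $\nu$. We can construct a coupling $\pi'$ of $\phi*\mu$ and $\phi*\nu$ by adding the same independent noise to both coordinates:
$\pi'[E] = \int_{\mathbb{R}^d\times\mathbb{R}^d}\int_{\mathbb{R}^d}\mathbb{I}_E(x+w,y+w)\phi(w)dw\,d\pi(x,y)$
for any Borel set $E\subset\mathbb{R}^d\times\mathbb{R}^d$. Equivalently, $\pi'$ is the law of $(X+Z,Y+Z)$, where $(X,Y)\sim\pi$ and $Z$ has density $\phi$ and is independent of $(X,Y)$.
Then $\pi'$ is a valid coupling of $\phi*\mu$ and $\phi*\nu$. Indeed, for any Borel set $A$, since the first marginal of $\pi$ is $\mu$, Tonelli's theorem and the substitution $u=x+w$ give:
$\pi'[A\times\mathbb{R}^d] = \int\int\mathbb{I}_A(x+w)\phi(w)dw\,d\mu(x) = \int\int\mathbb{I}_A(u)\phi(u-x)du\,d\mu(x) = (\phi*\mu)[A]$
and in the same way $\pi'[\mathbb{R}^d\times B]=(\phi*\nu)[B]$ for any Borel set $B$.
Since the same shift $w$ is applied to both coordinates, the ground cost $c(x,y)=\|x-y\|$ does not change:
$c(x+w,y+w) = c(x,y)$
Therefore, using $\int\phi(w)dw=1$:
$\int c(u,v)^p d\pi'(u,v) = \int\int c(x+w,y+w)^p\phi(w)dw\,d\pi(x,y) = \int c(x,y)^p d\pi(x,y)$
Hence $\mathcal{W}_p(\phi*\mu,\phi*\nu)^p \leq \int c(x,y)^p d\pi(x,y)$ for every coupling $\pi$ of $\mu$ and $\nu$. Taking the infimum over all $\pi$ and then $p$-th roots, this gives:
$\mathcal{W}_p(\phi*\mu,\phi*\nu) \leq \mathcal{W}_p(\mu,\nu)$
which proves the inequality. Note that the argument only uses that $\phi$ is a probability density; the compact support guarantees that $\phi*\mu$ and $\phi*\nu$ have finite $p$-th moments.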
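As a sanity check on the inequality, here is a worked special case (a sketch assuming $p\ge 1$; the points $a,b\in\mathbb{R}^d$ are chosen only for illustration). Take $\mu=\delta_a$ and $\nu=\delta_b$, so that $\mathcal{W}_p(\delta_a,\delta_b)=\|a-b\|$ and $\phi*\delta_a$ has density $x\mapsto\phi(x-a)$. Let $m:=\int w\phi(w)dw$, which is finite because $\phi$ is compactly supported. For any coupling $\pi'$ of $\phi*\delta_a$ and $\phi*\delta_b$, Jensen's inequality gives
$$ \left(\int\|x-y\|^pd\pi'(x,y)\right)^{1/p}\ge\int\|x-y\|d\pi'(x,y)\ge\left\|\int(x-y)d\pi'(x,y)\right\|=\|(m+a)-(m+b)\|=\|a-b\|. $$
Taking the infimum over $\pi'$ yields $\mathcal{W}_p(\phi*\delta_a,\phi*\delta_b)\ge\|a-b\|$, and together with the inequality above
$$ \mathcal{W}_p(\phi*\delta_a,\phi*\delta_b)=\|a-b\|=\mathcal{W}_p(\delta_a,\delta_b). $$
So equality can occur, and the inequality cannot be improved to a strict contraction in general.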