Suppose I have two probability measures $\mu(B) = \int_{B} f \, dx$ and $\nu(B) = \int_{B} g \, dx$ on $\mathbb{R}^2$, with densities $f$ and $g$. I want to compute the squared Wasserstein-2 distance, given by
\begin{equation*} W_2^2(\mu,\nu) = \inf_{T \in \mathcal{M}} \int_{\mathbb{R}^2} |x-T(x)|^2 f(x) \, dx, \end{equation*}
where $\mathcal{M}$ is the set of all maps $T : \mathbb{R}^2 \to \mathbb{R}^2$ such that $T_{*} \mu = \nu$. Multiplying and dividing by $4\pi$, this can be rewritten as
\begin{equation*} W_2^2(\mu,\nu) = \frac{1}{4\pi} \inf_{T \in \mathcal{M}} \int_{\mathbb{R}^2} 4 \pi |x-T(x)|^2 f(x) \, dx. \end{equation*}
Now, for a fixed transport map $T$, the integrand $4\pi |x-T(x)|^2$ is the surface area of a sphere of radius $|x-T(x)|$, the kind of quantity one gets from a "solid of revolution" calculation, and the integral weights this surface area by the probability density $f$. So $4\pi W_2^2(\mu,\nu)$ can be thought of as the minimum, over all transport maps, of a surface area of revolution averaged against a probability density. Has anyone made this translation more precise, or does it help build intuition for the geometry of optimal transport? It may be a total coincidence that does not lead to much insight. Just curious.
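To make the picture concrete, here is one case where everything is explicit. This is only a sanity check under assumptions of my own: $\mu$ has finite second moment, and $a \in \mathbb{R}^2$ is a fixed vector introduced purely for illustration. If $g(x) = f(x-a)$, then the translation $T(x) = x + a$ pushes $\mu$ forward to $\nu$, and it is optimal because it is the gradient of the convex function $\tfrac{1}{2}|x|^2 + a \cdot x$. In that case
\begin{equation*} 4\pi W_2^2(\mu,\nu) = \int_{\mathbb{R}^2} 4\pi |a|^2 f(x) \, dx = 4\pi |a|^2, \end{equation*}
which is exactly the surface area of a single sphere of radius $|a|$. For a general pair $\mu$, $\nu$ the radius $|x - T(x)|$ varies with $x$, so one gets an $f$-weighted average of such areas rather than the area of one surface.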