Hi I'm looking for some intuition into Kantorovitch's statement of the Optimal Transport Problem. Specifically some help with interpretation of the diagrams https://speakerdeck.com/gpeyre/numerical-optimal-transport-1?slide=26.
Monge's problem asks for the optimal (w.r.t some cost function) mapping $T$ which takes one measure $\alpha$ on $\mathcal{X}$ to another $\beta$ on $\mathcal{Y}$ via the push-forward $T_{\#}\alpha=\beta$. For me the map $T:\mathcal{X}\to \mathcal{Y}$ has a physical interpretation which tells us exactly how to transport mass, that is any mass at $x$ is transported to $T(x)$.
A problem with Monge's formulation is that it does not allow for the splitting of mass. A relaxation of the problem is Kantorovitchs formulation which instead asks for a optimal measure $\pi$ on $\mathcal{X}\times \mathcal{Y}$ with marginals $\alpha$ and $\beta$.
$Question : $ How should I physically interpret the mesaure/plan $\pi$? Specifically can someone describe the continuous diagram in the link provided?
You can view the coupling $\pi$ as a non-deterministic version of $T$, because functions cannot map the same value to different points.
One advantage of using an optimal coupling $\pi$ as transport plan is that it exists way more often than a map $T$, and if you have a $\pi$ which is an optimal transport from one measure $\mu$ to another measure $\nu$, then you always get an optimal transport from $\nu$ to $\mu$ as well - in the finite case when $\pi$ is a matrix that would be the transpose $\pi^T$.
The line you see in that image tells you how much mass gets transported from the density $\alpha$ to points of mass in the density $\beta$. I always imagine they are piles of sand, so you can push all the "grains of sand" from $\alpha$ horizontally until they hit the curve, and then push them up vertically to become $\beta$.
It might be useful to make a discrete example with distributions which look similar to $\alpha$ and $\beta$ and see how the matrix $\pi$ changes as you add more and more bins.