I recently learned about optimal transport (OT) and its generalization to comparing multiple distributions jointly, called multi-marginal optimal transport (MMOT).
In a nutshell, OT solves
$ \inf_{{\bf p}^{1,2}} \int_{{\Omega}^{1} \times {\Omega}^{2}} d \;\; {\rm d}{\bf p}^{1,2} \text{ subject to } \int_{{\Omega}^{1}} {\rm d}{\bf p}^{1,2} = {\bf p}^{2} \text{ and } \int_{{\Omega}^{2}} {\rm d}{\bf p}^{1,2} = {\bf p}^{1},$
for some cost $d$, and MMOT solves
$ \inf_{{\bf p}} \int_{{\Omega}^{1} \times\dots\times {\Omega}^{n}} d \;\; {\rm d}{\bf p} \text{ subject to } \int_{{\Omega}^1\times\dots\times{\Omega}^{i-1}\times{\Omega}^{i+1}\times\dots\times{\Omega}^n} {\rm d}{\bf p} = {\bf p}^{i} \quad \forall\, i = 1,\dots,n,$
for some multi-marginal cost $d$.
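To make the marginal constraints concrete, here is a minimal discrete sketch of my own, assuming each $\Omega^i$ is a finite set so that ${\bf p}$ becomes an $n$-way array of masses. It does not solve the MMOT problem: it only builds the independent (product) coupling, which is always feasible, checks that its marginals are the given ${\bf p}^i$, and evaluates the objective at that point. The names `product_coupling`, `marginal`, `total_cost` and `pairwise_sq_cost`, the example distributions, and the choice of cost are all hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def product_coupling(marginals):
    """Independent coupling: p(x1, ..., xn) = p1(x1) * ... * pn(xn)."""
    coupling = {}
    for idx in product(*(range(len(m)) for m in marginals)):
        mass = Fraction(1)
        for i, k in enumerate(idx):
            mass *= marginals[i][k]
        coupling[idx] = mass
    return coupling


def marginal(coupling, i, size):
    """Sum the coupling over every axis except axis i (the MMOT constraint)."""
    out = [Fraction(0)] * size
    for idx, mass in coupling.items():
        out[idx[i]] += mass
    return out


def total_cost(coupling, cost):
    """Discrete analogue of the integral of d against p."""
    return sum(cost(idx) * mass for idx, mass in coupling.items())


def pairwise_sq_cost(idx):
    """Assumed multi-marginal cost: sum of squared pairwise differences."""
    return sum((a - b) ** 2 for a, b in combinations(idx, 2))


# Hypothetical example data: three distributions on the points {0, 1}.
p = [
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(3, 4)],
    [Fraction(1, 3), Fraction(2, 3)],
]

coupling = product_coupling(p)

# Feasibility: every one-dimensional marginal of the coupling equals p^i.
feasible = all(marginal(coupling, i, len(p[i])) == p[i] for i in range(len(p)))

# Objective value at this feasible (generally non-optimal) coupling.
cost_at_product = total_cost(coupling, pairwise_sq_cost)
```

MMOT minimizes `total_cost` over all nonnegative arrays that pass the same marginal check; OT is the case $n = 2$.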
My question is: why is using a joint transport plan, as in MMOT, better than simply computing all pairwise transports between the $n$ distributions? Is there any concrete application where this difference shows itself?