Algorithms to compute the Wasserstein distance between two discrete probability measures.

2.6k Views Asked by At

Let $\mu$ & $\nu$ be a pair of discrete probability measures on $\mathbb{R}^d$ defined by

$$\mu = \frac{1}{n}\sum_{i=1}^n \delta_{x_i} \quad ;\quad \nu= \frac{1}{n}\sum_{i=1}^n \delta_{y_i}$$

where $x_i, y_i \in \mathbb{R}^d$. We can think of $\mu$ and $\nu$ as denoting two distributions of $n$ shipping crates on $\mathbb{R}^d$ where crates have equal mass and can be stacked on top of one another. I want to find an algorithm to compute the Wasserstein distance between $\mu$ and $\nu$. The Wasserstein distance is given by

$$ W_p(\mu,\nu) = \min_{\sigma \in S_n} \left\{ \frac{1}{n} \sum_{i=1}^n |x_i-y_{\sigma(i)} |^p \right\}. \hskip20pt (*)$$

I.e. The Wasserstein distance gives the minimum cost necessary to move all crates from distribution $\mu$ to distribution $\nu$, where the cost for moving a crate from $x$ to $y$ in this case is $\frac{1}{n}|x - y|^p$. See e.g. Villani - Topics in Optimal Transport Theory, page 5.

The minimizer clearly exists because $S_n$ is finite, but (at least naively) it seems to be difficult to compute in practice because $|S_n| = n!$. Looking online I find some hints that the problem might be solvable in polynomial time, but I become stooped in complicated general theory which does not immediately seem to apply to this problem. I am not familiar with either optimal transport theory or optimization. I am most interested in the case $p=1$, though the cases $p\in(1,\infty)$ are also of interest.

Can anyone provide either:

(i) An account of existing theory/algorithms for computing (*), ideally in the case $p=1$ but also possibly for general $p\in[1,\infty)$, or

(ii) Some clear references covering (i).

Many thanks! A.

1

There are 1 best solutions below

4
On BEST ANSWER

In your case this could be seen as an instance of the Assignment Problem, which you can solve using the Hungarian Method in $\mathcal{O}(n^3)$.