Chain rule for Wasserstein distance?


Suppose that $\mu \in \mathcal{P}(\mathbb{R}^{n} \times \mathbb{R}^{n})$ is a probability measure on $\mathbb{R}^{n} \times \mathbb{R}^{n}$, that $\mu_1(dx) \in \mathcal{P}(\mathbb{R}^{n})$ is its first marginal, and that $\mu_2(dy \mid x)$ is the associated stochastic kernel from $\mathbb{R}^{n}$ to $\mathcal{P}(\mathbb{R}^{n})$, i.e. the conditional law of $y$ given $x$. Then the disintegration $$ \mu(dx \times dy) = \mu_1(dx) \times \mu_2(dy \mid x) $$ holds.

Assume that $\nu_1 \in \mathcal{P}(\mathbb{R}^{n})$ and $\nu_2 \in \mathcal{P}(\mathbb{R}^{n})$ are given, fixed probability measures, and let $\nu$ be their product, so that the two coordinates are independent under $\nu$: $$\nu(dx \times dy) := \nu_1(dx) \times \nu_2(dy).$$

Now consider the order-2 Wasserstein distance, $$ W_2(\mu, \nu)^2 := \inf_{\eta \in \mathcal{P}((\mathbb{R}^{n} \times \mathbb{R}^{n})^2)} \left\{ \int_{(\mathbb{R}^{n} \times \mathbb{R}^{n})^2} \Vert z-z'\Vert^2 \, d\eta(z,z') \;\middle|\; \Pi^1\eta = \mu, \ \Pi^2\eta = \nu \right\}, $$ where $z = (x, y)$ and $z' = (x', y')$ are points of $\mathbb{R}^{n} \times \mathbb{R}^{n}$ and $\Pi^i\eta$ denotes the $i$th marginal of $\eta$. What I want to know is whether a chain rule (factorization) holds for the Wasserstein distance, namely $$ W_2(\mu, \nu)^2 = W_2(\mu_1, \nu_1)^2 + \int_{\mathbb{R}^{n}} W_2(\mu_2(\cdot \mid x), \nu_2)^2 \, d\mu_1(x). $$ I know that this property holds for relative entropy, but is it true for the Wasserstein metric?
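One way to probe the proposed identity is to evaluate both sides in a case where everything has a closed form. The sketch below is only a sanity check under stated assumptions, not a proof in either direction: it takes $n = 1$, lets $\mu$ be a centered Gaussian with unit variances and correlation `rho`, and lets $\nu$ be the standard Gaussian on $\mathbb{R}^2$. It relies on the standard closed-form expression for $W_2^2$ between Gaussians. The names `sqrtm_psd`, `w2_sq_gaussian`, `lhs`, `rhs` and the parameter `rho` are illustrative and do not come from the question.

```python
import numpy as np


def sqrtm_psd(a):
    """Square root of a symmetric positive semidefinite matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)  # guard against tiny negative eigenvalues
    return (v * np.sqrt(w)) @ v.T


def w2_sq_gaussian(m1, s1, m2, s2):
    """Closed-form W_2^2 between N(m1, s1) and N(m2, s2)."""
    r = sqrtm_psd(s2)
    cross = sqrtm_psd(r @ s1 @ r)
    return float(np.sum((m1 - m2) ** 2)
                 + np.trace(s1) + np.trace(s2) - 2.0 * np.trace(cross))


def lhs(rho):
    """W_2(mu, nu)^2 with mu = N(0, [[1, rho], [rho, 1]]), nu = N(0, I)."""
    zero = np.zeros(2)
    sigma = np.array([[1.0, rho], [rho, 1.0]])
    return w2_sq_gaussian(zero, sigma, zero, np.eye(2))


def rhs(rho):
    """W_2(mu_1, nu_1)^2 + E_{x ~ mu_1} W_2(mu_2(.|x), nu_2)^2.

    Here mu_1 = nu_1 = N(0, 1), so the first term is 0, and
    mu_2(.|x) = N(rho * x, 1 - rho^2), nu_2 = N(0, 1), so that
    W_2(mu_2(.|x), nu_2)^2 = (rho * x)^2 + (sqrt(1 - rho^2) - 1)^2.
    Averaging over x ~ N(0, 1) replaces x^2 by 1.
    """
    first = 0.0
    conditional = rho ** 2 + (np.sqrt(1.0 - rho ** 2) - 1.0) ** 2
    return float(first + conditional)


if __name__ == "__main__":
    for rho in (0.0, 0.5, 1.0):
        print(f"rho={rho}: lhs={lhs(rho):.6f}, rhs={rhs(rho):.6f}")
```

For `rho = 0` (so that $\mu = \nu$) both sides are 0. For `rho = 1` the closed forms give $4 - 2\sqrt{2}$ on the left and $2$ on the right, so in this particular family the two sides need not agree. The check covers only this one-parameter Gaussian family.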


There is 1 best solution below


Yes. See Computational Optimal Transport chapter 9.1 "Differentiating the Wasserstein Loss".