An intriguing duality gap for Wasserstein distance for Gaussian distributions

265 Views Asked by At

Let $P=N(a,\sigma^2 I)$ and $Q=N(b, \sigma^2 I)$ be two Gaussian distributions in $\mathbb{R}^d$. We know that the Wasserstein distance $P$ and $Q$ is defined as $$ W_2^2(P,Q) \triangleq \inf_{\pi \in \Pi(P,Q)} \int_{\mathbb{R}^d} \|x-y \|^2 d \pi(x,y), $$ where $\Pi(P,Q)$ denotes all the couplings of $P$ and $Q$. It is also a well-known fact that $$ W_2^2(P,Q)=\inf_{T\#P=Q}\int_{\mathbb{R}^d} \|T(x)-x \|^2 dx=\sup_{f,g:f(x)+g(y)\leq \|x-y\|^2} \int f \ dP + \int g \ dQ, $$ where $T\#P$ refers to the push-forward of $P$ under the map $T$. Moreover, the optimal transport map $T$ is related to the optimal dual solution $f$ according to $T(x)=x-\nabla f(x)$. In our speicific case of Gaussians, it is known that $T(x)=x+b-a$ and thus $W_2^2(P,Q)=\|a-b\|^2$.

My question is regarding the computation of dual using these optimal $f$ and $g$. It follows from the relation between $T$ and $f$ that $f(x)=x^\top(a-b)$. Also, the optimal $g$ is related to the $f$ according to $g(y)=f^c(y) \triangleq \inf_{x \in \mathbb{R}^d}\|x-y\|^2 - f(x)$ which simplifies to $g(y)=-y^\top(a-b)-\frac{\|a-b\|^2}{4}$. Thus the dual optimum equals $$ \int f \ dP + \int g \ dQ =a^\top(a-b)-b^\top(a-b)-\frac{\|a-b\|^2}{4}=\frac{3}{4} \|a-b \|^2. $$

I don't see any mistake in the above procedure but also don't understand why the extra factor $3/4$ appears. Can anyone explain which step of the above computation is probably incorrect?

1

There are 1 best solutions below

2
On BEST ANSWER

Your formula $T(x)=x-\nabla f(x)$ is not correct. It is correct if you take $\frac{1}{2}|x-y|^2$ as the cost. In general with cost function $c(|x-y|)$ the formula is $T(x)=x-\nabla c^*(\nabla f(x))$ where $c^*$ is the convex conjugate of $c$ (see Thm 2.44 in C. Villani, topics in optimal transportation). For example in your case the formula is $T(x)= x - \frac{1}{2}\nabla f(x)$ which would lead to $f(x)=2x^\top(a-b)$ and $g(y) = -2y^\top(a-b)-|a-b|^2$. The resulting value for the objective function is $2a^\top(a-b) - 2b^\top(a-b) - |a-b|^2= |a-b|^2$ as expected