Density of pushforward measure on geodesic


Suppose I have measures $\mu,\nu$ with densities $f,g$ respectively (in the sense of the Radon-Nikodym derivative). I can connect them by the geodesic $$ \mu_t = ((1-t)Id + tT)_\#\mu, \qquad t \in [0,1], $$ where $T$ is a map such that $$ T_\#\mu = \nu. $$ Naively, I would expect the density of $\mu_t$ to be something like $$ (1-t)f + t\cdot g. $$ Is there any known relationship describing the density of a pushforward measure?


There are 2 best solutions below


Interesting question! I'm also interested in an answer; however, I have a counterexample to your expected density of $\mu_t$:

Let $\mu\sim\mathcal{N}(0,1)$ and $\nu\sim\mathcal{N}(2,1)$, i.e. two Gaussian measures with variance 1 and means 0 and 2 respectively. Then the shortest path between these two measures in Wasserstein-2 space is just $\mu_t\sim\mathcal{N}(2t, 1)$, a single Gaussian bump with variance 1 that is translated from 0 to 2 as $t$ runs over $[0,1]$. Your proposed density $(1-t)f + t\cdot g$, on the other hand, would continuously lower the bump at 0 and raise the one at 2 without moving either.
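The difference can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the grid and the helper names `gaussian_pdf` and `moments` are chosen here for illustration. It compares the mixture $(1-t)f + t\cdot g$ with the density of $\mathcal{N}(2t,1)$ at $t = 1/2$: both have mean 1, but the mixture has variance $1 + 4t(1-t) = 2$, while the geodesic keeps variance 1.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def moments(x, density):
    """Mass, mean and variance of a density sampled on a uniform grid."""
    dx = x[1] - x[0]
    mass = np.sum(density) * dx
    mean = np.sum(x * density) * dx / mass
    var = np.sum((x - mean) ** 2 * density) * dx / mass
    return mass, mean, var

x = np.linspace(-10.0, 12.0, 22001)
t = 0.5

# Naive guess: linear interpolation of the two densities.
mixture = (1.0 - t) * gaussian_pdf(x, 0.0, 1.0) + t * gaussian_pdf(x, 2.0, 1.0)
# Wasserstein-2 geodesic: the Gaussian is translated, its shape is unchanged.
geodesic = gaussian_pdf(x, 2.0 * t, 1.0)

_, mix_mean, mix_var = moments(x, mixture)
_, geo_mean, geo_var = moments(x, geodesic)
print("mixture : mean", mix_mean, "variance", mix_var)
print("geodesic: mean", geo_mean, "variance", geo_var)
```

Since the two candidates already disagree in their second moments, they cannot be the same measure.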

A figure from Computational Optimal Transport (https://arxiv.org/abs/1803.00567) should help visualize the situation (although I don't know how your idea precisely compares to KL-interpolation).

I hope someone can give a more detailed answer!


As pointed out in the answer by @dirich1337, the density of the interpolating measures is not that simple.

There is the "classical change of density formula" (see e.g. Section 5.5 of Ambrosio, Gigli, Savaré: Gradient Flows in Metric Spaces and in the Space of Probability Measures). Lemma 5.5.3 states that for $t \in (0, 1)$ the density of $\mu_t = (\Phi_t)_\# \mu$, where $$\Phi_t \colon \mathbb R^d \to \mathbb R^d, \qquad x \mapsto (1 - t) x + t T(x),$$ with respect to the Lebesgue measure $\mathcal L^d$ on $\mathbb R^d$ is $$ \frac{f}{|\det(\nabla \Phi_t)|} \circ \Phi_t^{-1}\big|_{\Phi_t(\mathbb R^d)}. $$ This holds under the assumption that $\det(\nabla \Phi_t) > 0$ almost everywhere (only then does $(\Phi_t)_{\#} \mu$ even have a density with respect to the Lebesgue measure) and that $\{ f > 0 \}$ agrees, up to an $\mathcal L^d$-null set, with a set $\Sigma$ on which $\Phi_t$ is injective. (Note that by Remark 6.2.11 in the above-mentioned book, the transport map $T$ is $\mu$-essentially injective.)
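In one dimension the formula is easy to test numerically. The sketch below is an illustration under assumptions that are not part of the reference: $\mu = \mathcal N(0,1)$, and the map $T(x) = x + x^3$ is a hypothetical choice (it is the derivative of the convex function $x^2/2 + x^4/4$, hence increasing). The inverse $\Phi_t^{-1}$ is not computed in closed form but tabulated on a grid and interpolated with `np.interp`.

```python
import numpy as np

def f(x):
    """Density of mu = N(0, 1)."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def T(x):
    """Hypothetical increasing map: T = psi' with psi(x) = x^2/2 + x^4/4."""
    return x + x**3

def dT(x):
    return 1.0 + 3.0 * x**2

def phi(x, t):
    """Phi_t(x) = (1 - t) x + t T(x)."""
    return (1.0 - t) * x + t * T(x)

def dphi(x, t):
    """Derivative of Phi_t; it is >= 1 here, so Phi_t is invertible."""
    return (1.0 - t) + t * dT(x)

def density_mu_t(y, t, x_grid):
    """Evaluate (f / |Phi_t'|) at Phi_t^{-1}(y), inverting Phi_t by interpolation."""
    y_grid = phi(x_grid, t)            # increasing because Phi_t' > 0
    x = np.interp(y, y_grid, x_grid)   # approximate Phi_t^{-1}(y)
    return f(x) / np.abs(dphi(x, t))

def trapezoid(values, nodes):
    """Composite trapezoidal rule."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(nodes)))

t = 0.5
x_grid = np.linspace(-8.0, 8.0, 20001)
y = np.linspace(phi(-8.0, t), phi(8.0, t), 400001)
rho = density_mu_t(y, t, x_grid)

mass = trapezoid(rho, y)
second_moment = trapezoid(y**2 * rho, y)
# For X ~ N(0, 1): E[(X + t X^3)^2] = 1 + 6 t + 15 t^2.
expected_second_moment = 1.0 + 6.0 * t + 15.0 * t**2
print(mass, second_moment, expected_second_moment)
```

The computed density should integrate to approximately 1, and its second moment should match the value obtained directly from the pushforward, which is a basic consistency check of the formula in this example.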

Here $(\nabla \Phi_t)(x) = (1 - t) I_{d \times d} + t \cdot \nabla T(x)$ wherever $T$ is differentiable, and $T$ is indeed differentiable almost everywhere: since $\mu$ is absolutely continuous with respect to $\mathcal L^d$, Brenier's theorem asserts that there exists a convex function $\psi \colon \mathbb R^d \to \mathbb R$ with $T = \nabla \psi$. By Alexandrov's theorem (Thm. 5.5.4 in the above reference), $\nabla \psi$ is differentiable almost everywhere in its domain and its derivative $\nabla^2 \psi$ is a symmetric matrix.
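To see why the determinant in the change of density formula is positive for $t < 1$, one can diagonalize the Hessian. This is a sketch, assuming $x$ is a point where $\nabla^2 \psi(x)$ exists; the notation $\lambda_1(x), \dots, \lambda_d(x)$ for its eigenvalues is introduced here and is not taken from the reference. Since $\psi$ is convex, all $\lambda_i(x) \ge 0$, so $$ \det\big(\nabla \Phi_t(x)\big) = \det\big((1 - t) I + t \nabla^2 \psi(x)\big) = \prod_{i=1}^{d} \big((1 - t) + t \lambda_i(x)\big) \ge (1 - t)^d > 0 \qquad \text{for } t \in [0, 1). $$ For instance, for the translation $T(x) = x + 2$ in one dimension we have $\lambda_1 \equiv 1$, the determinant is identically 1, and the density is simply shifted.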

We have $$x = \Phi_t(\Phi_t^{-1}(x)) = (1 - t) \Phi_t^{-1}(x) + t (T \circ \Phi_t^{-1})(x) \qquad \forall x \in \Phi_t(\mathbb R^d)$$ and $$x = \Phi_t^{-1}(\Phi_t(x)) = \Phi_t^{-1}((1 - t) x + t T(x)) \qquad \forall x \in \mathbb R^d,$$ but I don't know how to get an expression for $\Phi_t^{-1}$ from that. I am not confident that this is possible in general: on p. 213 (and in (10.4.6)) of the book mentioned above, where exactly your question is addressed, they don't present a more explicit formula than the one I gave.
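Even without a closed form, $\Phi_t^{-1}$ can at least be described implicitly through convex duality. The following is a sketch, assuming $T = \nabla \psi$ with $\psi$ convex as given by Brenier's theorem and $t \in [0, 1)$; the symbol $\varphi_t$ is introduced here for illustration: $$ \Phi_t = \nabla \varphi_t, \qquad \varphi_t(x) := \frac{1 - t}{2} |x|^2 + t \psi(x), \qquad \Phi_t^{-1} = \nabla \varphi_t^*, \qquad \varphi_t^*(y) := \sup_{x \in \mathbb R^d} \big( \langle x, y \rangle - \varphi_t(x) \big). $$ Because $\varphi_t$ is $(1 - t)$-strongly convex, its Legendre conjugate $\varphi_t^*$ is differentiable and $\nabla \varphi_t^*$ is Lipschitz with constant $1 / (1 - t)$. So $\Phi_t^{-1}(y)$ is well defined and can be evaluated numerically as the maximizer of $x \mapsto \langle x, y \rangle - \varphi_t(x)$, even when no formula is available.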

Nevertheless, for explicit $f$ and $g$ you may be able to calculate the density of the interpolants explicitly.

Here's an example. Suppose $\mu$ is a Gaussian distribution with mean $m$ and covariance matrix $\Sigma_0 > 0$ and $T(x) := \Lambda x$ for some symmetric positive definite matrix $\Lambda > 0$. Then $T_\# \mu$ is a Gaussian with mean $\Lambda m$ and covariance matrix $\Lambda \Sigma_0 \Lambda$.

Now suppose $m = 0$. In order to find the matrix $\Lambda$ such that $T_\# \mu = \nu$, where $\nu$ is a Gaussian measure with zero mean and covariance matrix $\Sigma_1 > 0$, we have to solve the Riccati equation $\Lambda \Sigma_0 \Lambda = \Sigma_1$ for $\Lambda > 0$. As it turns out, the unique solution $\Lambda > 0$ is the geometric mean of $\Sigma_0^{-1}$ and $\Sigma_1$: $$ \Lambda = \Sigma_0^{-\frac{1}{2}} (\Sigma_0^{\frac{1}{2}} \Sigma_1 \Sigma_0^{\frac{1}{2}})^{\frac{1}{2}} \Sigma_0^{-\frac{1}{2}}. $$ Hence we have an explicit expression for the geodesic in Wasserstein space from $\mu$ to $\nu$: $\mu_t$ is the centered Gaussian with covariance matrix $((1 - t) I + t \Lambda) \Sigma_0 ((1 - t) I + t \Lambda)$.
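The Gaussian case can be verified numerically. The following is a minimal sketch, assuming NumPy; the function names and the two covariance matrices are made up for illustration. It uses the form $\Lambda = \Sigma_0^{-1/2} (\Sigma_0^{1/2} \Sigma_1 \Sigma_0^{1/2})^{1/2} \Sigma_0^{-1/2}$, which one can check directly satisfies $\Lambda \Sigma_0 \Lambda = \Sigma_1$, and it computes the covariance of the interpolant $\mu_t = ((1 - t) Id + t \Lambda)_\# \mu$.

```python
import numpy as np

def spd_sqrt(a):
    """Symmetric square root of a symmetric positive definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(a)
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T

def gaussian_transport_matrix(sigma0, sigma1):
    """Symmetric positive definite solution of Lambda sigma0 Lambda = sigma1."""
    root0 = spd_sqrt(sigma0)
    inv_root0 = np.linalg.inv(root0)
    middle = spd_sqrt(root0 @ sigma1 @ root0)
    return inv_root0 @ middle @ inv_root0

def interpolant_covariance(sigma0, lam, t):
    """Covariance of mu_t = ((1 - t) Id + t Lambda)_# N(0, sigma0)."""
    a_t = (1.0 - t) * np.eye(sigma0.shape[0]) + t * lam
    return a_t @ sigma0 @ a_t.T

# Illustrative covariance matrices (both symmetric positive definite).
sigma0 = np.array([[2.0, 0.5], [0.5, 1.0]])
sigma1 = np.array([[1.0, -0.3], [-0.3, 3.0]])

lam = gaussian_transport_matrix(sigma0, sigma1)
print("Lambda =\n", lam)
print("Lambda Sigma_0 Lambda =\n", lam @ sigma0 @ lam)
print("covariance of mu_{1/2} =\n", interpolant_covariance(sigma0, lam, 0.5))
```

The density of $\mu_t$ is then the usual centered Gaussian density with the covariance returned by `interpolant_covariance`, which recovers $\Sigma_0$ at $t = 0$ and $\Sigma_1$ at $t = 1$.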

(You can look up this and more in Robert McCann's seminal paper "A convexity principle for interacting gases".)