We are working with discrete optimal transport.
Let $P$ be a matrix and let $H(P) =- \sum_{i,j} P_{i,j} (\log(P_{i,j})-1)$.
Let $C$ be the cost matrix, and let $\langle C,P\rangle$ denote the Frobenius inner product.
We introduce the regularized optimal transport problem $\min_{P \in U(a,b)} \langle C,P\rangle - \epsilon H(P)$, where $U(a,b)$ is the set of coupling matrices with marginals $a$ and $b$.
We want to prove that as $\epsilon \to 0$, $P_\epsilon$ converges to an optimal solution of the original Kantorovich problem with maximal entropy.
I understand the proof up to the point where it says for any subsequence of $P_\epsilon$, we can choose a sub-subsequence of it that converges to an optimal transport plan with maximum entropy.
Question 1) The part I don't get is where it says that by strict convexity of $-H$ we get $P^* = P_0^*$. It is clear that $-H$ is strictly convex, but strict convexity only gives uniqueness of a minimizer over a convex set. It seems we are only minimizing over the optimal points of the Kantorovich problem, and I don't see why that is a convex set.
Question 2) It says that as $\epsilon \to \infty$, $P_\epsilon$ becomes less sparse, but I would have expected the opposite, since more entropy means more uncertainty.
Thank you!

Let $\mathcal{X}$ be the discrete set on which $a,b$ are measures.
With respect to your first question, consider a sequence $\epsilon_l\to 0$ with $\epsilon_l>0$. From what I gather, you can see that some subsequence (which for clarity I will relabel) $P_{\epsilon_{l_{k}}}$ converges; call the limit $P^*$. Moreover, the limit satisfies $P^*=\operatorname{argmin}\{-H(P)~:~P\in U(a,b),\ \langle P,C\rangle=L_C(a,b)\}.$ It is also clear that $-H$ is strictly convex.
Now the constraint set in that argmin is convex. First, $U(a,b)$ is convex: let $P,Q \in U(a,b)$ and let $\mathcal{Y}$ be any subset of $\mathcal{X}$; then $\lambda P(\mathcal{X}\times \mathcal{Y})+(1-\lambda) Q(\mathcal{X}\times \mathcal{Y})=\lambda b(\mathcal{Y})+(1-\lambda) b(\mathcal{Y})=b(\mathcal{Y})$, and the same holds for the other marginal. The set of optimal plans is then the intersection of $U(a,b)$ with the hyperplane $\{P : \langle P,C\rangle = L_C(a,b)\}$, hence convex. A strictly convex function has at most one minimizer over a convex set, so the minimizer $P_0^*$ of $-H$ over the optimal plans is unique. Every convergent subsequence of $P_{\epsilon_l}$ therefore has the same limit, so the full sequence converges to $P_0^*$. In particular, since $\epsilon_{l_{k}}$ is a subsequence of $\epsilon_{l}$, it must be that $P^*=P_0^*$.
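To make the limit tangible, here is a small numerical sketch (my own illustration, not part of the proof) that solves the regularized problem with the standard Sinkhorn scaling iterations for shrinking $\epsilon$; the function name `sinkhorn` and the specific $2\times 2$ example are mine:

```python
import numpy as np

def sinkhorn(C, a, b, eps, n_iter=2000):
    """Solve min <C,P> - eps*H(P) over U(a,b) by Sinkhorn scaling.

    The regularized optimum has the form diag(u) @ K @ diag(v) with
    K = exp(-C/eps); we alternately rescale v and u to match the
    column and row marginals b and a.
    """
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Moving mass along the diagonal is free, off-diagonal costs 1,
# so the unique unregularized optimum is diag(0.5, 0.5).
C = np.array([[0.0, 1.0], [1.0, 0.0]])
a = b = np.array([0.5, 0.5])

for eps in [1.0, 0.3, 0.1]:
    P = sinkhorn(C, a, b, eps)
    print(eps, np.round(P, 4))
# As eps shrinks, the off-diagonal entries decay toward 0,
# so P_eps approaches the optimal plan diag(0.5, 0.5).
```

In this example the limit plan is unique, so the max-entropy selection is invisible; the selection only matters when the Kantorovich problem has several optimal plans.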
For question 2): the sparsity of a matrix describes the number of zeros it has; the more zeros, the more sparse. Adding entropy is like adding diffusion (for a PDE, a Laplacian term). It blurs the optimal transport plan, spreading mass out and thereby decreasing the number of zeros in the optimal plan, i.e. it becomes less sparse.
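One way to see this concretely (a standard first-order-conditions computation, not part of the proof above): introducing Lagrange multipliers $f_i, g_j$ for the two marginal constraints and setting the gradient of the regularized objective to zero forces

$$C_{ij} + \epsilon \log P_{ij} = f_i + g_j \quad\Longrightarrow\quad P_{ij} = e^{f_i/\epsilon}\, e^{-C_{ij}/\epsilon}\, e^{g_j/\epsilon} > 0,$$

so the entropic plan has no zero entries at all for any $\epsilon>0$. Moreover, as $\epsilon \to \infty$ the kernel $e^{-C/\epsilon}$ tends to the all-ones matrix, and $P_\epsilon \to a b^\top$, the fully dense independent coupling.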