Are max entropy distributions under a norm constraint Lipschitz with respect to moving the constraint?


Let $p$ be a probability distribution and let $f(p)$ be the maximum-entropy distribution that is $\varepsilon$-close to $p$ in some norm. I conjecture that the function $f$ is 1-Lipschitz. Formally, we have the following statement.

Conjecture: Fix $d \in \mathbb{N}$ and $\varepsilon \ge 0$. Let $\|\cdot\|$ be an arbitrary norm on $\mathbb{R}^d$. Let $\Delta \subseteq \mathbb{R}^d$ be the space of probability distributions on a set of size $d$, i.e., $\Delta = \left\{p \in \mathbb{R}^d : \forall i ~ p_i \ge 0, \sum_{i=1}^d p_i = 1\right\}$. For $p \in \Delta$, define the entropy $h : \Delta \to [0,\log d]$ by $h(p) = \sum_{i=1}^d p_i \log (1/p_i)$ (with the convention $0 \log (1/0) = 0$, for continuity).
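For concreteness, here is a direct implementation of $h$ with the stated convention (the function name `entropy` is ours, not from the post; natural logarithm throughout):

```python
import numpy as np

# h(p) = sum_i p_i * log(1/p_i), with the convention 0 * log(1/0) = 0.
# With the natural log, h maps the simplex into [0, log d].
def entropy(p):
    p = np.asarray(p, dtype=float)
    nz = p > 0  # skip zero entries, implementing the 0*log(1/0) = 0 convention
    return float(np.sum(p[nz] * np.log(1.0 / p[nz])))

d = 4
u = np.full(d, 1.0 / d)
print(entropy(u))               # log(4) ~ 1.3863, the maximum over the simplex
print(entropy([1.0, 0, 0, 0]))  # 0.0, the minimum (a point mass)
```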

Define $f : \Delta \to \Delta$ by $f(p) = \mathrm{argmax}_q ~ h(q)$, where $q \in \Delta$ must satisfy $\|q-p\|\le\varepsilon$. Note that $f$ is well-defined: the constraint set $\{q \in \Delta : \|q-p\| \le \varepsilon\}$ is compact and convex (the intersection of the simplex with a norm ball), so a maximizer exists, and it is unique because the entropy function is strictly concave.

We claim that, for all $p,p' \in \Delta$, $\| f(p) - f(p') \| \le \| p - p' \|$.

Some intuition: if $\varepsilon=0$, then $f$ is the identity function and the claim is trivially true. If $\varepsilon$ is sufficiently large (specifically, at least the radius of the simplex around its center, i.e., $\varepsilon \ge \sup_{p \in \Delta} \|p-u\|$, where $u=(1/d,1/d,\dots,1/d)$ is the uniform distribution), then $f$ is the constant function $p \mapsto u$ and the claim is again trivially true. My intuition is that intermediate values of $\varepsilon$ interpolate between these two extremes.
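For the Euclidean norm, the radius in the second extreme has a simple closed form: $\|p-u\|_2$ is convex in $p$, so its maximum over the simplex is attained at a vertex $e_i$, giving $\sup_{p \in \Delta} \|p-u\|_2 = \|e_1-u\|_2 = \sqrt{1-1/d}$. A quick numerical check (our own illustration, with $d=3$):

```python
import numpy as np

# Radius of the simplex around the uniform distribution, Euclidean norm.
# ||p - u||_2 is convex in p, so the sup over the simplex is at a vertex.
d = 3
u = np.full(d, 1.0 / d)           # uniform distribution
e1 = np.zeros(d); e1[0] = 1.0     # a vertex of the simplex
radius = np.linalg.norm(e1 - u)
print(radius, np.sqrt(1 - 1 / d))  # both ~0.8165 for d = 3
```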

I'm conjecturing that the Lipschitz constant is 1, but I would be mostly satisfied if the conjecture could be proved with a larger Lipschitz constant.

Best answer:

The conjecture does not hold. Here is a counterexample (found numerically):

$p=(0.34545229,\ 0.14548291,\ 0.5090648)$, $p'=(0.35411664,\ 0.02645345,\ 0.61942991)$, $\varepsilon=0.05$.
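One way to sanity-check this pair without cvxpy is to solve the same maximization with scipy's SLSQP solver. This is a sketch under the assumption that SLSQP converges to the optimum (the problem is convex, so any local maximum is global); the Euclidean norm is used, as in the simulation below.

```python
import numpy as np
from scipy.optimize import minimize

def max_entropy_in_ball(p, eps=0.05):
    """Maximum-entropy distribution within Euclidean distance eps of p (SLSQP sketch)."""
    p = np.asarray(p, dtype=float)
    # Minimize negative entropy; clip avoids log(0) at the boundary.
    neg_entropy = lambda q: float(np.sum(q * np.log(np.clip(q, 1e-12, None))))
    cons = [
        {"type": "eq",   "fun": lambda q: np.sum(q) - 1.0},            # stay on the simplex
        {"type": "ineq", "fun": lambda q: eps**2 - np.sum((q - p)**2)},  # stay in the eps-ball
    ]
    res = minimize(neg_entropy, p, bounds=[(0.0, 1.0)] * len(p),
                   constraints=cons, method="SLSQP")
    return res.x

p  = np.array([0.34545229, 0.14548291, 0.5090648])
pp = np.array([0.35411664, 0.02645345, 0.61942991])
fp, fpp = max_entropy_in_ball(p), max_entropy_in_ball(pp)
# The claimed violation: the second value should exceed the first.
print(np.linalg.norm(p - pp), np.linalg.norm(fp - fpp))
```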

Python code for the simulation is below:

import cvxpy as cp
import numpy as np
from scipy.spatial.distance import cdist

def F(p, eps=0.05):
    """Maximum-entropy distribution within Euclidean distance eps of p."""
    n = len(p)
    x = cp.Variable(n)
    ent = cp.sum(cp.entr(x))
    constraints = [x >= 0, cp.sum(x) == 1, cp.sum_squares(x - p) <= eps**2]
    prob = cp.Problem(cp.Maximize(ent), constraints)
    prob.solve()
    return x.value

# randomly sampled distributions on the simplex
X = []
# corresponding value of f(p) for each distribution p
Y = []
for _ in range(1000):
    p = np.random.rand(3)
    p = p / np.sum(p)
    X.append(p)
    Y.append(F(p))
X = np.array(X)
Y = np.array(Y)

# the fraction of pairs for which the bound is violated by more than 0.001
print(np.mean(cdist(X, X) + 1e-3 < cdist(Y, Y)))

In this simulation, about 0.3% of pairs violated the conjectured inequality by more than 0.001.