Is it true that $D_{KL}[q_1 \| p]\leq D_{KL}[q_2 \| p] \Rightarrow D_{KL}[p\|q_1]\leq D_{KL}[p \| q_2]$?

61 Views Asked by At

The KL divergence between two probability measures $p, q$ on a Polish space $E$ is defined as

$$D_{KL}[q \| p] = \int_E \log \left(\frac{dq}{dp}\right)dq$$

provided that $q \ll p$.

Now is it true that $D_{KL}[q_1 \| p]\leq D_{KL}[q_2 \| p] \Rightarrow D_{KL}[p\|q_1]\leq D_{KL}[p \| q_2]$ provided that $p,q_1,q_2$ are equivalent in the sense of measures?

1

There are 1 best solutions below

0
On BEST ANSWER

The answer is no.

I'll explicitly construct a triple $(p,q_1,q_2)$ on four points that violates the relation. This can be extended to however rich a Polish space you like (as long as it has at least $4$ points) by convolving with appropriately thin densities.


Let $$p = \frac{(1,1,1,1)}{4}\\ q_1^x = \frac{(2-x, 2-x, x,x)}{4}\\ q_2 = \frac{(3, 1, 1, 1)}{6},$$ where $x \in (0,2)$ will be chosen in the following.

To construct a counterexample, we need to find an $x$ such that $D(q_1^x \|p) < D(q_2\|p)$ and $D(p\|q_1^x ) > D(p\|q_2)$. By direct computation, this boils down to finding an $x$ such that \begin{align} (2-x) \log(2-x) + x \log x &< \log(4/3)\\ 2\log(2-x) + 2\log(x) &< \log(16/27) \end{align}

The easiest way to find an appropriate $x$ is to simply use a CAS. Wolfram alpha suggests that $x = 1/2$ is good enough. Indeed, $$ (3/2) \log(3/2) +(1/2) \log(1/2) = \log(\sqrt{27}/4) \le \log(5.22/4) = \log(1.305) < \log(4/3)\\ 2\log(3/2) + 2\log(1/2) = \log(9/16) < \log(16/27),$$ where the first line uses that $\sqrt{3} < 1.74,$ and the second line follows since $9 \times 27 = 243 < 256 = 16^2$.


Process for finding the above - I started with $p$ being a uniform law because that makes $D(q\|p)$ and $D(p\|q)$ both have simple forms. $D(q_1\|p) < D(q_2\|p)$ is then equivalent to $H(q_1)>H(q_2),$ which led me to think about $q_1$ that is a little more spread out than $q_2$ - to simplify I chose $q_1$ peaked on half the space and $q_2$ on a quarter. $D(p\|q_1) > D(p\|q_2)$ is equivalent to $\sum \log(1/q_1(i)) > \sum \log(1/q_2(i))$. At this point I realised that I might as well take just $4$ points, and ran a CAS for $q_1$ and $q_2$ parametrised. Finally I just set the parameter of $q_2$ somewhere in the middle to make calculations easy.