A generalisation of a well known result in information theory

206 Views Asked by At

It is well known that Entropy is additive, and that it is the only sensible choice for measuring uncertainty if we want additivity to hold, i.e.

$H(XY) = H(X)+H(Y)$

or more explicitly, if we have the constraint that

$\int_\chi\int_\chi p_1(x)p_2(x) f(p_1(x)p_2(x)) dx^2 = \int_\chi p_1(x) f(p_1(x)) dx + \int_\chi p_2(x) f(p_2(x))dx$

with $\int_\chi p_1 dx = \int_\chi p_2 dx = 1$. The only choice is $f(u) = k \log(u)$.

There is a similar problem concerning f-divergences, where we can introduce an additivity constraint:

$\int_\chi\int_\chi p_1(x)p_2(x) f\left(\frac{p_1(x)p_2(x)}{q_1(x)q_2(x)}\right) dx^2 = \int_\chi p_1(x) f\left(\frac{p_1(x)}{q_1(x)}\right) dx + \int_\chi p_2(x) f\left(\frac{p_2(x)}{q_2(x)}\right)dx$

I am pretty sure that the only valid functions $f$ are

$f(u) = (A/u+B)\log(u)$

but I do not know how to show that (or even if) this exhausts all the possibilities.

EDIT:

Corrected u to 1/u, to fit the definition of $u = p/q$ (usually it is $q/p$).

How I get to $(\frac{A}{u}+B)\log(u)$

First of all, if $f(u)$ and $g(u)$ are solutions, so is their linear combination: $a f(u) + b g(u)$. I can find two solutions that are not linearly related

1) $f(u) = k_1 \log u$

2) $f(u) = k_2 \frac{\log u}{u}$ (this effectively switches p and q)

these can be shown to work, making all the solutions of the form I proposed valid as they are linear combinations of (1) and (2). The problem I have is that I am not sure that there are not other functions that are not linearly related to either (1) or (2).

1

There are 1 best solutions below

3
On

Edited: With due permission I am completely rewriting the proof.

Let $$D_f(P|Q)=\sum_i p_if(q_i/p_i)$$ where $f$ is a strictly convex function such that $f(1)=0$, satisfy additivity, i.e., $D_f(P_1\star P_2|Q_1\star Q_2)=D_f(P_1|Q_1)+D_f(P_2|Q_2)$ with $D_f((1,0)|(1/2,1/2))=1$, where $P\star Q$ is the product distribution, i.e., $(p_iq_j)_{i,j}$.

This proof is just a by product of Renyi's work.

Let $ P_1=P_2=(1,0),Q_1=(q_1,1-q_1), Q_2=(q_2,1-q_2)$.

Then $D_f(P_1\star P_2|Q_1\star Q_2)=f(q_1q_2)$, $D_f(P_1|Q_1)=f(q_1)$, and $D_f(P_2|Q_2)=f(q_2)$

So by additivity, $f(q_1q_2)=f(q_1)+f(q_2)$. And from $D_f((1,0)|(1/2,1/2))=1$, $f(1/2)=1$. Also from Jensen's inequality $D_f(P|Q)\ge f(1)=0$, hence $f$ is non negative.

The only $f$ satisfying the above three properties is $f(q)=-\log q$ and we are done.