What is the proof of entropy property: $H\left(x,y\right)\le H\left(x\right)+H\left(y\right)$ in Shannon's paper?


In Shannon's 1948 paper "A Mathematical Theory of Communication", in the discussion of the entropy of the joint event, the inequality (the subadditivity of entropy) $$H\left(x,y\right) \le H\left(x\right)+H\left(y\right)$$ is stated without proof.

However, I found a proof based on Jensen's inequality at this link https://www.cs.princeton.edu/courses/archive/fall11/cos597D/L01.pdf, but I am confused about which convex function is used in it. The notes say the convex function is $$\log{\frac{1}{x}}$$ Below is the proof:

$$H\left(x,y\right)-H\left(x\right)-H\left(y\right)$$ $$=\left(-\sum_{i,j}{p\left(i,j\right)\log{p\left(i,j\right)}}\right)-\left(-\sum_{i,j}{p\left(i,j\right)\log{\sum_{j}\ p\left(i,j\right)}}\right)-\left(-\sum_{i,j}{p\left(i,j\right)\log{\sum_{i}\ p\left(i,j\right)}}\right)$$ $$=\left(-\sum_{i,j}{p\left(i,j\right)\log{p\left(i,j\right)}}\right)-\left(-\sum_{i,j}{p\left(i,j\right)\log{p\left(i\right)}}\right)-\left(-\sum_{i,j}{p\left(i,j\right)\log{p\left(j\right)}}\right)$$ $$=\sum_{i,j}\left(p\left(i,j\right)\left(-\log{p\left(i,j\right)+\log{p\left(i\right)+\log{p\left(j\right)}}}\right)\right)$$ $$=\sum_{i,j}\left(p\left(i,j\right)\left(\log{\frac{p\left(i\right)p\left(j\right)}{p\left(i,j\right)}}\right)\right)$$
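As a numerical sanity check of the algebraic identity above (a NumPy sketch; the $3\times 4$ alphabet and the random joint distribution are illustrative assumptions, not from the notes), one can verify that $H(x,y)-H(x)-H(y)$ really does equal $\sum_{i,j} p(i,j)\log\frac{p(i)p(j)}{p(i,j)}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint distribution p(i, j) over a 3x4 alphabet.
p = rng.random((3, 4))
p /= p.sum()

pi = p.sum(axis=1)  # marginal p(i) = sum_j p(i, j)
pj = p.sum(axis=0)  # marginal p(j) = sum_i p(i, j)

H_xy = -np.sum(p * np.log2(p))
H_x = -np.sum(pi * np.log2(pi))
H_y = -np.sum(pj * np.log2(pj))

# Last line of the derivation: sum_{i,j} p(i,j) * log( p(i) p(j) / p(i,j) ).
rhs = np.sum(p * np.log2(np.outer(pi, pj) / p))

# The two quantities agree (up to floating-point error).
print(H_xy - H_x - H_y, rhs)
```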

Here comes the question: for the convex function $\log{\frac{1}{x}}$, Jensen's inequality applies directly, while for the concave function $\log{x}$, the reversed form of Jensen's inequality applies. How does one decide whether $$\log{\frac{p\left(i\right)p\left(j\right)}{p\left(i,j\right)}}$$ should be treated as an instance of $$\log{\frac{1}{x}}$$ or of $$\log{x}$$?

If we let $x=\frac{p(i)p(j)}{p(i,j)}$, then $\log{\frac{p(i)p(j)}{p(i,j)}}$ is the concave function $\log{x}$, and the concave form of Jensen's inequality, $\mathbb{E}[\log X]\le\log\mathbb{E}[X]$ with weights $p(i,j)$, gives

$$\sum_{i,j}\left(p\left(i,j\right)\left(\log{\frac{p\left(i\right)p\left(j\right)}{p\left(i,j\right)}}\right)\right)\le\log{\left(\sum_{i,j}\frac{p\left(i,j\right)p\left(i\right)p\left(j\right)}{p\left(i,j\right)}\right)\ }$$ $$=\log{\left(\sum_{i,j}\ p\left(i\right)p\left(j\right)\right)}$$ $$=\log{1}=0$$

so the subadditivity of entropy $$H(X,Y)\le H(X)+H(Y)$$ is proven.
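The inequality can also be sanity-checked numerically (a NumPy sketch under assumed random joint distributions; the $4\times 5$ alphabet and the sample count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# Check H(X,Y) <= H(X) + H(Y) on many random joint distributions.
for _ in range(1000):
    p = rng.random((4, 5))
    p /= p.sum()
    pi, pj = p.sum(axis=1), p.sum(axis=0)
    H_xy = -np.sum(p * np.log2(p))
    H_x = -np.sum(pi * np.log2(pi))
    H_y = -np.sum(pj * np.log2(pj))
    # Small tolerance for floating-point rounding.
    assert H_xy <= H_x + H_y + 1e-12

print("subadditivity holds on all sampled distributions")
```

Equality is attained exactly when $p(i,j)=p(i)p(j)$ for all $i,j$, i.e. when $X$ and $Y$ are independent, since then every $\log$ term in the sum vanishes.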

If instead we let $x=\frac{p(i,j)}{p(i)p(j)}$ (rather than $x=\frac{p(i)p(j)}{p(i,j)}$), then $\log{\frac{p(i)p(j)}{p(i,j)}}$ is the convex function $\log{\frac{1}{x}}$, and the proof becomes

$$\sum_{i,j}\left(p\left(i,j\right)\left(\log{\frac{p\left(i\right)p\left(j\right)}{p\left(i,j\right)}}\right)\right)$$ $$=\sum_{i,j}\left(p\left(i,j\right)\left(\log{\frac{1}{\left(\frac{p\left(i,j\right)}{p\left(i\right)p\left(j\right)}\right)}}\right)\right)\geq\log{\left(\frac{1}{\sum_{i,j}\left(p\left(i,j\right)\cdot\frac{p\left(i,j\right)}{p\left(i\right)p\left(j\right)}\right)}\right)}$$

so we only get $$H(X,Y)-H(X)-H(Y)\ge\log{\left(\frac{1}{\sum_{i,j}\left(p\left(i,j\right)\cdot\frac{p\left(i,j\right)}{p\left(i\right)p\left(j\right)}\right)}\right)}$$

and the subadditivity of entropy $$H(X,Y)\le H(X)+H(Y)$$ cannot be proven this way, since this is a lower bound rather than the upper bound we need.
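One can check numerically why this second direction is uninformative (a NumPy sketch with an assumed random joint distribution): the quantity inside the bound, $\sum_{i,j}\frac{p(i,j)^2}{p(i)p(j)}$, is at least $1$ (by Cauchy–Schwarz, since $\sum_{i,j}p(i)p(j)=1$), so the resulting lower bound is $\le 0$, which says nothing about the sign of $H(X,Y)-H(X)-H(Y)$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical random joint distribution over a 3x3 alphabet.
p = rng.random((3, 3))
p /= p.sum()
pi, pj = p.sum(axis=1), p.sum(axis=0)

# The quantity inside the lower bound: sum_{i,j} p(i,j)^2 / (p(i) p(j)).
s = np.sum(p**2 / np.outer(pi, pj))

# Since s >= 1, the bound log2(1/s) is <= 0: a vacuous lower bound
# on H(X,Y) - H(X) - H(Y), not the upper bound we want.
bound = np.log2(1.0 / s)
print(s, bound)
```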