Proving some inequalities related to Information Theory


I've been working through some inequalities from the information-theory section of my decision theory course, and I could use help with one of the derivations.

As a reminder, $$H(P) = -\sum_{i=1}^n p_i \log(p_i),$$ $$H(P,Q) = -\sum_{i=1}^n \sum_{j=1}^m \mathbb{P}(P = i, Q = j) \log(\mathbb{P}(P = i, Q = j)).$$ I want to prove that $$\max\{H(P),H(Q)\} \leq H(P,Q) \leq H(P) + H(Q).$$ The lower bound is straightforward given the chain-rule lemma $H(P,Q) = H(P \mid Q) + H(Q) = H(Q \mid P) + H(P)$, since conditional entropy is non-negative. The upper bound is trickier, since it comes down to proving $$H(P,Q) \leq H(P) + H(Q)$$ $$\Leftarrow H(P \mid Q) + H(Q) \leq H(P) + H(Q)$$ $$\Leftarrow H(P \mid Q) \leq H(P)$$ $$\Leftarrow 0 \leq H(P) - H(P \mid Q),$$ i.e., we need to show that the information gained about $P$ from observing $Q$ (the mutual information) is always non-negative. Any suggestions on how to approach a problem like this?
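Before attacking the proof, both bounds can be sanity-checked numerically on a concrete joint distribution. A minimal sketch (the `entropy` helper and the example joint matrix are my own choices, not from the question):

```python
import numpy as np

# Joint distribution of two dependent discrete variables
# (rows index P, columns index Q); any non-negative matrix summing to 1 works.
joint = np.array([[0.30, 0.10],
                  [0.05, 0.55]])

def entropy(p):
    """Shannon entropy in bits; 0 * log 0 is treated as 0."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

H_P = entropy(joint.sum(axis=1))   # marginal entropy of P
H_Q = entropy(joint.sum(axis=0))   # marginal entropy of Q
H_PQ = entropy(joint.ravel())      # joint entropy H(P,Q)

# max{H(P), H(Q)} <= H(P,Q) <= H(P) + H(Q)
assert max(H_P, H_Q) <= H_PQ + 1e-12
assert H_PQ <= H_P + H_Q + 1e-12
```

Trying a few random joint matrices the same way can build confidence that the inequality is not an artifact of one example.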

1 Answer


Essentially, Jensen's inequality.

\begin{align} H(P) - H(P \mid Q) &= - \sum_i p(i) \log p(i) + \sum_j p(j) \sum_i p(i \mid j) \log p(i \mid j)\\ &= - \sum_{i,j} p(i,j) \log p(i) + \sum_{i,j} p(i,j) \log p(i \mid j)\\ &= -\sum_{i,j} p(i,j) \log \frac{p(i)p(j)}{p(i,j)}\\ &\ge -\log \sum_{i,j} p(i,j) \frac{p(i)p(j)}{p(i,j)}\\ &= -\log 1\\ &= 0. \end{align}
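The chain above can also be verified numerically: $H(P) - H(P \mid Q)$ equals $-\sum_{i,j} p(i,j)\log\frac{p(i)p(j)}{p(i,j)}$, and that quantity is non-negative. A quick check (the example joint distribution is an arbitrary choice of mine):

```python
import numpy as np

# Joint distribution p(i, j); rows index P, columns index Q.
joint = np.array([[0.30, 0.10],
                  [0.05, 0.55]])
p_i = joint.sum(axis=1)  # marginal p(i)
p_j = joint.sum(axis=0)  # marginal p(j)

# Final displayed sum: -sum_{i,j} p(i,j) log( p(i)p(j) / p(i,j) )
mi = -np.sum(joint * np.log(np.outer(p_i, p_j) / joint))

# Entropy route: H(P) - H(P|Q), using p(i|j) = p(i,j) / p(j)
H_P = -np.sum(p_i * np.log(p_i))
H_P_given_Q = -np.sum(joint * np.log(joint / p_j))  # broadcasts p(j) over columns
mi_via_entropy = H_P - H_P_given_Q

assert mi >= 0                              # Jensen's inequality in action
assert abs(mi - mi_via_entropy) < 1e-12     # the two routes agree
```

Both routes compute the mutual information, and equality in Jensen's step holds exactly when $p(i,j) = p(i)p(j)$, i.e., when $P$ and $Q$ are independent.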