Convexity of KL divergence for channel setting


I have seen an excellent proof of what I am trying to show in Cover and Thomas, and also here, but I would like to check my understanding by attempting the proof another way.

Given a probability distribution $P_X$ and a conditional probability distribution $P_{Y|X}$, let us examine

$$I(X:Y) := D(P_{XY}\|P_X\cdot P_Y)$$

and show that this is concave in $P_X$ for some fixed $P_{Y|X}$. We have

\begin{align} D(P_{XY}\|P_X\cdot P_Y) &= \sum_x\sum_y P_{XY}(x,y)\log P_{XY}(x,y) - \sum_x\sum_y P_{XY}(x,y)\log (P_{X}(x)P_Y(y)) \\ &= \sum_x\sum_y P_{X}(x)P_{Y|X}(y|x)\log (P_{X}(x)P_{Y|X}(y|x)) - \sum_x\sum_y P_{X}(x)P_{Y|X}(y|x)\log (P_{X}(x)P_Y(y)) \\ &=\sum_x\sum_y P_{X}(x)P_{Y|X}(y|x)\log (P_{X}(x)P_{Y|X}(y|x)) - \sum_x\sum_y P_{X}(x)P_{Y|X}(y|x)\log \left(P_{X}(x)\sum_{x'}P_X(x')P_{Y|X}(y|x')\right) \end{align}
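As a numerical sanity check of the last equality (not part of the question's derivation; the small 3-input, 2-output example and all variable names are my own illustration), one can compare the standard expression for $D(P_{XY}\|P_X\cdot P_Y)$ against the expanded form in which $P_Y(y)$ is written as $\sum_{x'}P_X(x')P_{Y|X}(y|x')$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small example: 3 input symbols, 2 output symbols.
px = rng.random(3)
px /= px.sum()                          # P_X
pyx = rng.random((3, 2))
pyx /= pyx.sum(axis=1, keepdims=True)   # P_{Y|X}(y|x); each row sums to 1

# Standard mutual information: sum_{x,y} P_XY log( P_XY / (P_X P_Y) )
pxy = px[:, None] * pyx                 # joint P_{XY}(x,y) = P_X(x) P_{Y|X}(y|x)
py = pxy.sum(axis=0)                    # marginal P_Y(y)
I_std = np.sum(pxy * np.log(pxy / (px[:, None] * py[None, :])))

# Last line of the derivation, with P_Y expanded via the primed dummy variable:
#   sum_{x,y} P_X P_{Y|X} log(P_X P_{Y|X})
# - sum_{x,y} P_X P_{Y|X} log(P_X sum_{x'} P_X(x') P_{Y|X}(y|x'))
py_primed = np.einsum("x,xy->y", px, pyx)   # sum over x' of P_X(x') P_{Y|X}(y|x')
term1 = np.sum(pxy * np.log(px[:, None] * pyx))
term2 = np.sum(pxy * np.log(px[:, None] * py_primed[None, :]))
I_expanded = term1 - term2

assert np.isclose(I_std, I_expanded)
```

The two quantities agree, which is consistent with the last equality: the inner sum over $x'$ is exactly the marginal $P_Y(y)$, so introducing the primed variable changes nothing except making the dependence on $P_X$ explicit.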

The last step is the one I'm unsure of: to express $P_Y(y)$ in terms of $P_X$ and $P_{Y|X}$, it seems I have to introduce a primed dummy variable. I am then unsure how to evaluate the convexity of terms like $P_X(x)\log P_X(x')$, where the same distribution appears at two different arguments.

My questions are

  1. Is the last equality I wrote correct?

  2. How should one proceed from the second equality to show concavity in $P_X$, given that $P_Y$ is itself a function of $P_X$?
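Regarding question 2, the claimed concavity can at least be tested numerically before attempting the algebra. The following sketch (my own illustration, not from the question) fixes a random channel $P_{Y|X}$ and checks the Jensen inequality $I(\lambda P_1 + (1-\lambda)P_2) \ge \lambda I(P_1) + (1-\lambda)I(P_2)$ on random input distributions:

```python
import numpy as np

rng = np.random.default_rng(1)

def mutual_information(px, pyx):
    """I(X;Y) = D(P_XY || P_X P_Y) for input distribution px and channel pyx."""
    pxy = px[:, None] * pyx
    py = pxy.sum(axis=0)
    mask = pxy > 0                      # skip zero-probability terms (0 log 0 = 0)
    return np.sum(pxy[mask] * np.log(pxy[mask] / (px[:, None] * py[None, :])[mask]))

# Hypothetical fixed channel: 4 input symbols, 3 output symbols.
pyx = rng.random((4, 3))
pyx /= pyx.sum(axis=1, keepdims=True)

# Jensen check: I(lam*p1 + (1-lam)*p2) >= lam*I(p1) + (1-lam)*I(p2)
for _ in range(100):
    p1, p2 = rng.random(4), rng.random(4)
    p1 /= p1.sum()
    p2 /= p2.sum()
    lam = rng.random()
    lhs = mutual_information(lam * p1 + (1 - lam) * p2, pyx)
    rhs = lam * mutual_information(p1, pyx) + (1 - lam) * mutual_information(p2, pyx)
    assert lhs >= rhs - 1e-12           # concavity in P_X holds (up to rounding)
```

This does not constitute a proof, of course, but it confirms that the quantity to be shown concave is the one written above, with $P_Y$ re-marginalized from the mixed input distribution rather than mixed separately.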