Shannon's proof of the entropy power inequality


I am reading through Shannon's proof of the entropy power inequality (Appendix 6 of his paper *A Mathematical Theory of Communication*). The theorem states that $$ e^{2H(X)} + e^{2H(Y)} \leq e^{2H(X+Y)}, $$ where $H(X) = - \int p(x)\ln p(x)\, dx$, and similarly for $H(Y)$ and $H(X+Y)$.
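As a sanity check on the statement, one can verify numerically that Gaussians achieve equality, using the standard formula $H = \tfrac12\ln(2\pi e\sigma^2)$ for a Gaussian. This is my own illustration (the function name `entropy_power` is mine), not part of Shannon's argument:

```python
import math

# For X ~ N(0, v), H(X) = (1/2) ln(2*pi*e*v), so e^{2H(X)} = 2*pi*e*v.
# For independent X, Y we have X + Y ~ N(0, v1 + v2), and the entropy
# power inequality holds with equality in the Gaussian case.

def entropy_power(v):
    """e^{2H} for a Gaussian with variance v; equals 2*pi*e*v."""
    h = 0.5 * math.log(2 * math.pi * math.e * v)  # differential entropy
    return math.exp(2 * h)

v1, v2 = 1.5, 2.7  # arbitrary variances for X and Y
lhs = entropy_power(v1) + entropy_power(v2)   # e^{2H(X)} + e^{2H(Y)}
rhs = entropy_power(v1 + v2)                  # e^{2H(X+Y)}

print(lhs, rhs)
assert math.isclose(lhs, rhs)  # equality: Gaussians are the extremal case
```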

It seems to me that Shannon used Lagrange multipliers and the calculus of variations to derive the conditions for minimizing $H(X+Y) = -\int r(x)\ln r(x)\, dx$, where $r = p\star q$, with respect to the probability densities $p(x)$ and $q(y)$, under the constraint that the entropies $H(X)$ and $H(Y)$ are fixed. He then verified that Gaussian distributions satisfy these conditions.

However, I cannot follow some of the details in his proof. For example, from the equation $$ \int (1+\ln r(x_i))\, \delta r(x_i) + \lambda_1 (1+\ln p(x_i))\, \delta p(x_i) + \lambda_2 (1+\ln q(x_i))\, \delta q(x_i)\, dx_i = 0, $$ he concluded that when $p(x)$ is varied at a particular argument $x_i = s_i$, we have $$ \int q(x_i-s_i)\ln r(x_i)\, dx_i = -\lambda_1 \ln p(s_i). $$ However, it seems to me that the correct formula should be $$ \int q(x_i-s_i)\ln r(x_i)\, dx_i + 1 = -\lambda_1 \ln p(s_i) -\lambda_1. $$ Also, at the end of his proof, he claims that the conditions are satisfied when $A_{ij} = \frac{H_1}{H_2} B_{ij}$, and I don't understand why that is true. Could anyone help me with this?
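To make the question concrete, here is the variational step as I understand it (my own working, so the error may well be here). Since $r(x_i) = \int p(s)\, q(x_i - s)\, ds$, varying $p$ only at $x_i = s_i$ gives $\delta r(x_i) = q(x_i - s_i)\, \delta p(s_i)$ and $\delta q = 0$, so the stationarity condition becomes
$$ \int q(x_i - s_i)\bigl(1 + \ln r(x_i)\bigr)\, dx_i + \lambda_1 \bigl(1 + \ln p(s_i)\bigr) = 0, $$
and since $\int q(x_i - s_i)\, dx_i = 1$, this rearranges to
$$ \int q(x_i - s_i)\ln r(x_i)\, dx_i + 1 = -\lambda_1 \ln p(s_i) - \lambda_1, $$
which is where the extra constant terms in my version come from.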