In my information theory class I need to prove that entropy is concave (which is usually done with Jensen's inequality), but I want to use only the definition of entropy. However, my derivation leads to a wrong result.
Here is what I do:
I need to prove that:
$$\lambda {\rm H} \left(p\right)+\left(1-\lambda \right){\rm H} \left(q\right)\le {\rm H} \left(\lambda p+\left(1-\lambda \right)q\right)$$
I use the definition of entropy (with the summation carried out over $p$, over $q$, or over both $p$ and $q$):
$$\lambda {\rm H} \left(p\right)={\rm {\mathbb E}}_{p} \log \frac{1}{p^{\lambda } } ={\rm {\mathbb E}}_{p,q} \log \frac{1}{p^{\lambda }}$$
$$\left(1-\lambda \right){\rm H} \left(q\right)={\rm {\mathbb E}}_{q} \log \frac{1}{q^{\left(1-\lambda \right)} } ={\rm {\mathbb E}}_{p,q} \log \frac{1}{q^{\left(1-\lambda \right)} } $$
$${\rm H} \left(\lambda p+\left(1-\lambda \right)q\right)={\rm {\mathbb E}}_{p,q} \log \frac{1}{\lambda p+\left(1-\lambda \right)q} $$
Then collecting everything and moving to the left-hand side:
$${\rm {\mathbb E}}_{p,q} \log \frac{1}{p^{\lambda } } +{\rm {\mathbb E}}_{p,q} \log \frac{1}{q^{\left(1-\lambda \right)} } ={\rm {\mathbb E}}_{p,q} \log \frac{1}{p^{\lambda } q^{\left(1-\lambda \right)} } $$
$${\rm {\mathbb E}}_{p,q} \log \frac{1}{p^{\lambda } q^{\left(1-\lambda \right)} } -{\rm {\mathbb E}}_{p,q} \log \frac{1}{\lambda p+\left(1-\lambda \right)q} \le 0$$
Using log properties:
$${\rm {\mathbb E}}_{p,q} \log \frac{\lambda p+\left(1-\lambda \right)q}{p^{\lambda } q^{\left(1-\lambda \right)} } \le 0$$
So I have to prove that the ratio under the log is less than unity (for the log to be negative):
$$\frac{\lambda p+\left(1-\lambda \right)q}{p^{\lambda } q^{\left(1-\lambda \right)} } \le 1$$
Finally I get an inequality which is definitely wrong (since AM-GM tells me the opposite):
$$\lambda p+\left(1-\lambda \right)q\le p^{\lambda } q^{\left(1-\lambda \right)} $$
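A quick numeric sanity check I ran (sampling random $p$, $q$, $\lambda$; the sample count is arbitrary) confirms that the weighted AM-GM inequality indeed points the other way:

```python
import random

# Sample random p, q, lam and confirm that weighted AM-GM gives the
# *opposite* direction:  lam*p + (1-lam)*q >= p**lam * q**(1-lam).
random.seed(0)
for _ in range(10000):
    p, q, lam = random.random(), random.random(), random.random()
    am = lam * p + (1 - lam) * q      # weighted arithmetic mean
    gm = p ** lam * q ** (1 - lam)    # weighted geometric mean
    assert am >= gm - 1e-12, (p, q, lam)
print("weighted AM >= weighted GM held on all samples")
```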
So, where am I wrong?
This only answers what went wrong (it does not use your chain of ideas to actually prove concavity of entropy).
You have dropped the expectation probabilities. Correctly, ${\rm {\mathbb E}}_{p} F(p) = \sum_{x_i} p(x_i) F(p(x_i))$.
1) As a first example, let $\lambda = \frac12$, $p=1$, $q=0^+$. Then, applying your ideas (without expectation probabilities) to the very first line, you get
$$ \lambda {\rm H} \left(p\right)+\left(1-\lambda \right){\rm H} \left(q\right)\le {\rm H} \left(\lambda p+\left(1-\lambda \right)q\right) \\ \leftrightarrow\\ \frac12 \log \frac{1}{1} + \frac12 \log \frac{1}{0^+}\le \log \frac{1}{\frac12 \cdot 1 + \frac12 \cdot 0 } = \log 2 $$ which is of course wrong. The correct treatment, including the expectation probabilities, is $$ \frac12 \cdot 1 \cdot \log \frac{1}{1} + \frac12\cdot 0^+ \cdot \log \frac{1}{0^+}\le \frac12\log 2 $$ and this is correct, since $0^+ \cdot \log \frac{1}{0^+} \to 0^+ $.
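These numbers can be checked mechanically; here is a small sketch of mine, with $0^+$ replaced by the stand-in value $10^{-12}$:

```python
import math

# Redo example 1 numerically: lam = 1/2, p = 1, q -> 0+ (here q = 1e-12).
# Without the expectation weights the left side blows up past log 2;
# with the weights p*log(1/p) and q*log(1/q) it stays tiny and finite.
lam, p, q = 0.5, 1.0, 1e-12

unweighted = lam * math.log(1 / p) + (1 - lam) * math.log(1 / q)
weighted = lam * p * math.log(1 / p) + (1 - lam) * q * math.log(1 / q)
rhs = (lam * p + (1 - lam) * q) * math.log(1 / (lam * p + (1 - lam) * q))

print(unweighted > math.log(2))   # True: unweighted version violates the bound
print(weighted <= rhs)            # True: weighted version satisfies it
```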
2) In a general setting which you want to pursue, putting the expectation probabilities in, you have (where for short, ${\rm {\mathbb E}} f(p) = \sum_{x_i} f(p(x_i))$ is used), $$ {\rm {\mathbb E}}_{p,q} \log \frac{1}{p^{\lambda } } +{\rm {\mathbb E}}_{p,q} \log \frac{1}{q^{\left(1-\lambda \right)} } = {\rm {\mathbb E}}\left[ p \log \frac{1}{p^{\lambda } } + q \log \frac{1}{q^{\left(1-\lambda \right)} }\right] $$ and, somewhat more involved, $$ {\rm {\mathbb E}}_{p,q} \log \frac{1}{\lambda p+\left(1-\lambda \right)q}={\rm {\mathbb E}}\left[({\lambda p+\left(1-\lambda \right)q})\log \frac{1}{\lambda p+\left(1-\lambda \right)q}\right] $$ which leaves you to show $$ p \log \frac{1}{p^{\lambda } } + q \log \frac{1}{q^{\left(1-\lambda \right)} }\le ({\lambda p+\left(1-\lambda \right)q})\log \frac{1}{\lambda p+\left(1-\lambda \right)q} $$ Continuing as you have laid it out would result in having to show $$ \frac{({\lambda p+\left(1-\lambda \right)q})^{({\lambda p+\left(1-\lambda \right)q})}}{p^{\lambda p} \cdot q^{\left(1-\lambda \right) q}} \le 1 $$ but I do not think this will lead us anywhere.
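A numeric sketch of mine (not part of the original argument) checks the per-symbol inequality above over random samples; note that the left-hand side equals $\lambda f(p) + (1-\lambda) f(q)$ for $f(x) = x\log(1/x)$:

```python
import math
import random

# Check  p*log(1/p**lam) + q*log(1/q**(1-lam)) <= m*log(1/m)
# with m = lam*p + (1-lam)*q, over random samples in (0, 1].
def f(x):
    # f(x) = x * log(1/x); the LHS above equals lam*f(p) + (1-lam)*f(q)
    return x * math.log(1 / x)

random.seed(1)
for _ in range(10000):
    p = random.uniform(1e-9, 1.0)
    q = random.uniform(1e-9, 1.0)
    lam = random.random()
    m = lam * p + (1 - lam) * q
    assert lam * f(p) + (1 - lam) * f(q) <= f(m) + 1e-12
print("per-symbol inequality held on all samples")
```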
3) As a third piece of explanation, which shows the effect of the above error versus the correct treatment, just notice that $\log (1/x)$ is convex whereas $x \log (1/x)$ is concave ... this says it all.
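To see the two curvatures concretely, here is a midpoint check (my addition; sample ranges are arbitrary):

```python
import math
import random

# Midpoint test: for a convex function the chord midpoint lies above the
# function value; for a concave function the relation flips.
g = lambda x: math.log(1 / x)          # convex on (0, inf)
f = lambda x: x * math.log(1 / x)      # concave on (0, inf)

random.seed(2)
for _ in range(10000):
    a = random.uniform(1e-6, 1.0)
    b = random.uniform(1e-6, 1.0)
    mid = (a + b) / 2
    assert (g(a) + g(b)) / 2 >= g(mid) - 1e-12   # convexity of log(1/x)
    assert (f(a) + f(b)) / 2 <= f(mid) + 1e-12   # concavity of x*log(1/x)
print("midpoint checks passed")
```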