How do I maximize entropy?


In the book on probability I am reading, I am asked to prove that the entropy of $X$ is maximized when $X$ is uniformly distributed.
At first I came up empty and decided to check online. Most proofs made use of the AM-GM inequality, which the book did not cover, so I was wondering if I could come up with a proof that relied only on material covered in the book.
I tried using the fact that $$\begin{align} \ln x \le x-1 <x&\Rightarrow -x\ln x > -x^2 \\ &\Rightarrow -\sum_i p_i\ln p_i >- \sum_i p_i^2 \\ &\Rightarrow H(X) >- \sum_i p_i^2 \end{align}$$ To maximize $H(X)$ we should maximize $F(\mathbf {p} )=-\sum_i p_i^2$ subject to the constraint $g(\mathbf {p} )=\sum_i p_i=1$. Using the method of Lagrange multipliers, we get that $p_i=p_j \quad \forall i\ne j$.
I would like to know if this argument is correct or if there are easier ways to prove the result given that the book assumes knowledge of differential equations, multivariable calculus and linear algebra.
I was also wondering if I could apply the method of Lagrange Multipliers to the entropy itself.
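As a numerical sanity check (not something the book asks for), both $F(\mathbf p)$ and the entropy itself can be maximized under the constraint $\sum_i p_i = 1$ with SciPy's `minimize`; the function names `neg_F` and `neg_H` below are my own. Both objectives land on the uniform distribution:

```python
import numpy as np
from scipy.optimize import minimize

n = 4
p0 = np.array([0.4, 0.3, 0.2, 0.1])     # an arbitrary starting distribution

def neg_F(p):
    # minimizing sum p_i^2 is the same as maximizing F(p) = -sum p_i^2
    return np.sum(p**2)

def neg_H(p):
    # minimizing sum p_i log p_i is the same as maximizing the entropy H(p)
    return np.sum(p * np.log(p))

constraint = {"type": "eq", "fun": lambda p: np.sum(p) - 1.0}
bounds = [(1e-9, 1.0)] * n               # keep p_i > 0 so log(p_i) is defined

for objective in (neg_F, neg_H):
    res = minimize(objective, p0, bounds=bounds, constraints=[constraint])
    # both objectives are maximized at the uniform distribution p_i = 1/n
    assert np.allclose(res.x, 1.0 / n, atol=1e-4)
```

This only illustrates the conclusion for one $n$, of course; it does not settle whether the lower-bound argument itself is valid.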

3 Answers

Accepted answer:

One approach:

  • Say your probability distribution takes values in $\{1,\cdots,n\}$.
  • Then $H(X) = \mathbb{E}\left[\log\frac{1}{p(X)}\right] \le \log \mathbb{E} \left[ \frac{1}{p(X)} \right] = \log n$, by Jensen's inequality applied to the concave function $f(x)=\log x$; the last equality holds because $\mathbb{E}\left[\frac{1}{p(X)}\right] = \sum_i p_i \cdot \frac{1}{p_i} = n$.
  • Equality holds precisely when $p(X)$ is constant, i.e. when $X$ is uniformly distributed.
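The key identity $\mathbb{E}[1/p(X)] = n$ and the resulting bound are easy to check numerically; a small sketch assuming NumPy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
p = rng.dirichlet(np.ones(n))           # a generic distribution on {1, ..., n}

# E[1/p(X)] = sum_i p_i * (1/p_i) = n for any distribution with full support
assert np.isclose(np.sum(p * (1.0 / p)), n)

# Jensen with the concave log: H(X) = E[log(1/p(X))] <= log E[1/p(X)] = log n
H = np.sum(p * np.log(1.0 / p))
assert H <= np.log(n) + 1e-12

# equality holds at the uniform distribution
u = np.full(n, 1.0 / n)
assert np.isclose(-np.sum(u * np.log(u)), np.log(n))
```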
Second answer:

I think the proof using Jensen's inequality is nicer, but let's do the proof using the Lagrange-multiplier (LM) method just for clarity. Consider the discrete case: maximize $-\displaystyle\sum_{i}p_i\log p_i$ subject to $\displaystyle\sum_{i}p_i=1$, i.e. find the stationary points of the Lagrangian $\mathcal{L}(\mathbf{p},\lambda)=-\displaystyle\sum_i p_i\log p_i-\lambda\left(\sum_i p_i-1\right)$. The first-order condition is

$$ -1-\log(p_i)-\lambda=0$$

i.e. $\log(p_i)=-1-\lambda$, the same value for every $i$. Moreover, the second derivative of $-p\log p$ is $-1/p<0$, so the objective is strictly concave and the stationary point is indeed a maximum.

Hence $p_i=p_j$ for all $i$ and $j$. Moreover, the probabilities are positive. Therefore $p_i=\frac{1}{N}$, where $N$ is the number of states.
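A quick numerical illustration of this answer's two claims (a sketch; the helper `entropy` is my own): at $p_i = 1/N$ the entropy equals $\log N$, and any small perturbation that stays on the constraint set lowers it, consistent with the concavity argument above.

```python
import numpy as np

N = 8
uniform = np.full(N, 1.0 / N)

def entropy(p):
    return -np.sum(p * np.log(p))

# at the stationary point p_i = 1/N the entropy equals log N
assert np.isclose(entropy(uniform), np.log(N))

# any small perturbation that keeps sum(p) = 1 strictly lowers the entropy,
# consistent with the second derivative of -p*log(p) being -1/p < 0
rng = np.random.default_rng(42)
for _ in range(100):
    d = rng.normal(size=N)
    d -= d.mean()                        # perturbation preserves sum(p) = 1
    p = uniform + 1e-3 * d / np.linalg.norm(d)
    assert entropy(p) < entropy(uniform)
```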

Third answer:

Consider the convex function $f(t)=t\log t$ for $t>0$.

Jensen's inequality for a convex $f$, with the expectation taken with uniform weights $\frac{1}{n}$, gives $$f(\mathsf E\, p(x))\le \mathsf E\, f(p(x))$$

That is $$\left(\frac{1}{n}\sum_{i=1}^n p_i\right)\log\left(\frac{1}{n}\sum_{i=1}^n p_i\right)\le\frac{1}{n}\sum_{i=1}^n p_i\log p_i$$

Since $\sum_{i=1}^n p_i=1$, the left-hand side equals $\frac{1}{n}\log\frac{1}{n}$. Multiplying both sides by $-n$ (which reverses the inequality) gives

$$H\le -n\left(\frac{1}{n}\right)\log\left(\frac{1}{n}\right)=\log n$$ The maximum entropy $H=\log n$ is achieved when $p_1=p_2=\cdots=p_n=\frac{1}{n}$, by the equality condition of Jensen's inequality.
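The chain of steps in this answer can be verified numerically for an arbitrary distribution; a sketch assuming NumPy, with `lhs`/`rhs` as my own names for the two sides of the Jensen step:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
p = rng.dirichlet(np.ones(n))

# Jensen for the convex f(t) = t*log(t) with uniform weights 1/n:
# f(mean(p)) <= mean(f(p))
mean_p = p.mean()                        # equals 1/n because sum(p) = 1
assert np.isclose(mean_p, 1.0 / n)
lhs = mean_p * np.log(mean_p)            # (1/n) * log(1/n)
rhs = np.mean(p * np.log(p))
assert lhs <= rhs + 1e-12

# multiplying by -n reverses the inequality and bounds the entropy
H = -np.sum(p * np.log(p))
assert np.isclose(-n * lhs, np.log(n))
assert H <= np.log(n) + 1e-12
```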