Negative Entropy $x\log x$ is convex


Can someone tell me (or show me) where I can find a proof of the convexity of \[f:\mathbb{R}_{\geq0}\to\mathbb{R},\quad x\mapsto x\log x\] that does not use the first- or second-derivative trick?

We call a map $f$ convex if \[f(\lambda x+(1-\lambda)\tilde{x})\leq\lambda f(x)+(1-\lambda)f(\tilde{x})\quad\text{for all }\lambda\in(0,1)\text{ and all }x,\tilde{x}\text{ in the domain.}\]
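(Not a proof, of course, but a quick numerical sanity check of this inequality for $f(x)=x\log x$, using the usual continuous extension $f(0)=0$:)

```python
import math
import random

def f(x):
    # negative entropy, with the usual continuous extension f(0) = 0
    return 0.0 if x == 0 else x * math.log(x)

random.seed(0)
violations = 0
for _ in range(10_000):
    x, y = random.uniform(0.0, 10.0), random.uniform(0.0, 10.0)
    lam = random.uniform(0.0, 1.0)
    lhs = f(lam * x + (1 - lam) * y)      # f at the convex combination
    rhs = lam * f(x) + (1 - lam) * f(y)   # value on the chord
    if lhs > rhs + 1e-9:                  # small tolerance for rounding
        violations += 1

print(violations)
```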

There are 4 answers below.


Answer to previous version of this question:

This is hair-splitting, but: the first derivative of your function is an increasing function, so your function is convex.

Hair-splitting, because the first-derivative trick is not the second-derivative trick, and "increasing first derivative" is not exactly "non-negative second derivative" — but it implies convexity all the same.
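For completeness, the derivative computation behind that remark (standard calculus, so exactly what the question hopes to avoid):

```latex
f'(x) = \frac{\mathrm{d}}{\mathrm{d}x}\,x\log x = \log x + 1, \qquad x > 0,
```

and $\log x + 1$ is strictly increasing since $\log$ is; a differentiable function with an increasing derivative is convex.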


So I tried the proof. To prove that a function is convex we take two points $x$ and $y$ and show that $$f(\lambda x + (1-\lambda) y) \leq \lambda f(x) + (1-\lambda) f(y).$$ The idea is simple: if we join the two points $x$ and $y$ by a segment, every point on it has the form $\lambda x + (1-\lambda) y$. As $\lambda$ varies over $(0,1)$ we find that at $\lambda = 1$ we get the point $x$, at $\lambda = 0$ we get the point $y$, and at $\lambda = 0.5$ we get the midpoint of $x$ and $y$. Convexity says that over this segment the graph of $f$ lies below the chord joining $(x, f(x))$ and $(y, f(y))$.

Now for the proof: write $s = \lambda x + (1-\lambda) y$ and assume $x, y > 0$ (the endpoint cases $x=0$ or $y=0$, with $f(0)=0$, follow directly from $\log((1-\lambda)y)\leq\log y$). Then $$f(s) = s \log s = \lambda x \log s + (1-\lambda) y \log s$$

So it suffices to show $$\lambda x \log s + (1-\lambda) y \log s \leq \lambda x \log x + (1-\lambda) y \log y,$$ which rearranges to $$\lambda x \log \frac{s}{x} + (1-\lambda) y \log \frac{s}{y} \leq 0$$

Now use the tangent-line bound $\log u \leq u - 1$ for $u > 0$, which can be proven without derivatives (for instance from Bernoulli's inequality, via $(1+t/n)^{n} \geq 1+t$ and $e^{t}\geq 1+t$). Applying it with $u = s/x$ and $u = s/y$ gives $$\lambda x \log \frac{s}{x} + (1-\lambda) y \log \frac{s}{y} \leq \lambda x \left(\frac{s}{x} - 1\right) + (1-\lambda) y \left(\frac{s}{y} - 1\right) = s - \lambda x - (1-\lambda) y = 0$$

This proves $$f(\lambda x + (1-\lambda) y) \leq \lambda f(x) + (1-\lambda) f(y),$$ meaning that $f(x) = x\log x$ is convex.
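Not a proof, but a quick sampling check of the tangent-line bound $\log u \leq u-1$ and of the key inequality $\lambda x \log(s/x) + (1-\lambda) y \log(s/y) \leq 0$, where $s = \lambda x + (1-\lambda) y$:

```python
import math
import random

random.seed(1)
max_gap = float("-inf")
for _ in range(10_000):
    x, y = random.uniform(0.1, 10.0), random.uniform(0.1, 10.0)
    lam = random.uniform(0.01, 0.99)
    s = lam * x + (1 - lam) * y
    # tangent-line bound: log(u) <= u - 1 for u > 0
    assert math.log(s / x) <= s / x - 1 + 1e-12
    # key inequality: lam*x*log(s/x) + (1-lam)*y*log(s/y) <= 0
    gap = lam * x * math.log(s / x) + (1 - lam) * y * math.log(s / y)
    max_gap = max(max_gap, gap)

print(max_gap <= 1e-9)
```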

Hope this helps.


That's what I figured out: Let $\lambda\in (0,1)$ and $x,y>0$. Then we get \begin{align*} f(\lambda x+(1-\lambda)y)&=[\lambda x+(1-\lambda)y]\log(\lambda x+(1-\lambda)y)\\ &=\lambda x\log(\lambda x+(1-\lambda)y)+(1-\lambda)y\log(\lambda x+(1-\lambda)y). \end{align*} Using $\log u\leq u-1$ with $u=\frac{\lambda x+(1-\lambda)y}{x}$, for the first term of the RHS we get \[\lambda x\log(\lambda x+(1-\lambda)y)\leq\lambda x\log(x)+\lambda(1-\lambda)(y-x),\] and likewise, with $u=\frac{\lambda x+(1-\lambda)y}{y}$, for the second term \[(1-\lambda)y\log(\lambda x+(1-\lambda)y)\leq (1-\lambda)y\log(y)+\lambda(1-\lambda)(x-y).\] The two correction terms cancel, so in summary this results in \[f(\lambda x+(1-\lambda)y)\leq \lambda f(x)+(1-\lambda)f(y),\] which is exactly convexity.
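A quick numerical check of the term-wise bounds with the cancelling correction terms $\lambda(1-\lambda)(y-x)$ and $\lambda(1-\lambda)(x-y)$, assuming $x,y>0$:

```python
import math
import random

random.seed(2)
ok = True
for _ in range(10_000):
    x, y = random.uniform(0.1, 10.0), random.uniform(0.1, 10.0)
    lam = random.uniform(0.01, 0.99)
    s = lam * x + (1 - lam) * y
    # bound for the first term: lam*x*log(s) <= lam*x*log(x) + lam*(1-lam)*(y-x)
    b1 = lam * x * math.log(s) <= lam * x * math.log(x) + lam * (1 - lam) * (y - x) + 1e-9
    # bound for the second term: the correction term has the opposite sign, so they cancel
    b2 = (1 - lam) * y * math.log(s) <= (1 - lam) * y * math.log(y) + lam * (1 - lam) * (x - y) + 1e-9
    ok = ok and b1 and b2

print(ok)
```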


Not sure if it is still relevant, yet there is no accepted answer, so I'll give it a try.

To prove that $f(x)=x \log{x}$ is convex on $\mathbb{R}_{\geq 0}$, we should show that $$f(\lambda x_{1}+(1-\lambda)x_{2}) \leq \lambda f(x_{1}) + (1-\lambda) f(x_{2})$$ for $\lambda\in(0,1)$.

So we have $$ \begin{align} f(\lambda x_{1}+(1-\lambda)x_{2}) &=(\lambda x_{1} + (1-\lambda) x_{2}) \log{ (\lambda x_{1} + (1-\lambda) x_{2})} \\ &=\lambda x_{1} \cdot \log{ (\lambda x_{1} + (1-\lambda) x_{2})} + (1-\lambda) x_{2} \cdot \log{ (\lambda x_{1} + (1-\lambda) x_{2})} \end{align}$$

Consider the term $\lambda x_{1} \log{ (\lambda x_{1} + (1-\lambda) x_{2})}=\lambda x_{1} \log{ \left(\lambda x_{1} \cdot \left\{1+ \frac{(1-\lambda) x_{2}}{\lambda x_{1}}\right\}\right)}$

Using the logarithm product rule, we get that $$\lambda x_{1} \log{ \left(\lambda x_{1} \cdot \left\{1+ \frac{(1-\lambda) x_{2}}{\lambda x_{1}}\right\}\right)}=\lambda x_{1} \log{\lambda x_{1}} + \lambda x_{1} \log{\left(1+ \frac{(1-\lambda) x_{2}}{\lambda x_{1}}\right)}$$.

One can see that $\log{\left(1+ \frac{(1-\lambda) x_{2}}{\lambda x_{1}}\right)} \geq 0$ since $x_{1}$ and $x_{2}$ are positive numbers and $\lambda \in (0,1)$ — but this only bounds the term from below, which is the wrong direction for convexity. To get an upper bound, split off $x_{1}$ instead of $\lambda x_{1}$: writing $s = \lambda x_{1}+(1-\lambda)x_{2}$, $$\lambda x_{1} \log{s} = \lambda x_{1} \log{x_{1}} + \lambda x_{1} \log{\frac{s}{x_{1}}}.$$

By the tangent-line bound $\log{u} \leq u - 1$ (valid for all $u>0$ and provable without derivatives, e.g. from Bernoulli's inequality), $$\lambda x_{1} \log{\frac{s}{x_{1}}} \leq \lambda x_{1} \left(\frac{s}{x_{1}}-1\right) = \lambda (s - x_{1}).$$

Applying the same logic for the second term, it can be shown that $$(1-\lambda) x_{2} \log{\frac{s}{x_{2}}} \leq (1-\lambda)(s - x_{2}).$$

The two error terms add up to $\lambda(s-x_{1})+(1-\lambda)(s-x_{2}) = s - s = 0$,

which means that
$$ (\lambda x_{1} + (1-\lambda) x_{2}) \log{ (\lambda x_{1} + (1-\lambda) x_{2})} \leq \lambda x_{1} \log{x_{1}} + (1-\lambda) x_{2} \log{x_{2}} $$

Thus, $f(x)=x\log{x}$ is convex.
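The final inequality can also be checked numerically in its exponentiated form $s^{s}\leq x_{1}^{\lambda x_{1}}\,x_{2}^{(1-\lambda)x_{2}}$ (take logarithms to recover the inequality above); values are kept small to avoid overflow:

```python
import math
import random

random.seed(3)
ok = True
for _ in range(10_000):
    x1, x2 = random.uniform(0.1, 5.0), random.uniform(0.1, 5.0)
    lam = random.uniform(0.01, 0.99)
    s = lam * x1 + (1 - lam) * x2
    # exponentiated convexity inequality: s^s <= x1^(lam*x1) * x2^((1-lam)*x2)
    lhs = s ** s
    rhs = x1 ** (lam * x1) * x2 ** ((1 - lam) * x2)
    ok = ok and lhs <= rhs * (1 + 1e-9)

print(ok)
```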