Case of equality in Jensen's Inequality


I haven't really understood parts of the proof given for $\textrm{Problem 6.2}$ of Steele's Cauchy-Schwarz Master Class.

Suppose that $f : [a, b] \to \mathbb{R}$ is strictly convex and show that if

\begin{equation} f\left(\sum^{n}_{j=1}p_jx_j\right)=\sum_{j=1}^{n}p_jf(x_j)\end{equation} where the positive reals $p_j$, $j = 1,2,\ldots ,n$ have sum $p_1+p_2+\dots+p_n = 1$, then one must have \begin{align} x_1 =x_2 =\dots=x_n \end{align}

He argues that if $x_1 =x_2 =\dots=x_n$ doesn't hold, then the set $\displaystyle S:=\{j:x_j\neq \max_{1\leq k \leq n}\ x_k\}$ is a proper subset of $\{ 1,2,\dots ,n \}$, and goes on to show that this leads to a contradiction.

He first sets \begin{equation} p=\sum_{j \in S}p_j,\quad x=\sum_{j\in S} \frac{p_j}{p} x_j,\quad \text{and}\quad y=\sum_{j\notin S}\frac{p_j}{1-p}x_j\end{equation}

So from the definition of strict convexity \begin{equation} f\left( \sum_{j=1}^{n} p_jx_j\right)=f\left( \sum_{j\in S}p_jx_j +\sum_{j \notin S} p_jx_j\right)=f(px+(1-p)y)<pf(x)+(1-p)f(y)\end{equation}
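As a sanity check on this step, here is a minimal numerical sketch. The data `xs`, `ps` and the choice $f(t)=t^2$ are assumptions made only for illustration; they are not from Steele's text. It computes $p$, $x$ and $y$ as defined above, confirms that $px+(1-p)y$ reproduces the full weighted mean, and confirms that the strict inequality holds for this example.

```python
from fractions import Fraction as F

# Hypothetical data, chosen only for illustration (not from the book).
xs = [F(1), F(2), F(4), F(4)]
ps = [F(1, 6), F(1, 3), F(1, 4), F(1, 4)]  # positive weights summing to 1

def f(t):
    # t**2 is one strictly convex function; assumed here as an example
    return t * t

m = max(xs)
S = [j for j, v in enumerate(xs) if v != m]      # indices not at the maximum
not_S = [j for j, v in enumerate(xs) if v == m]  # indices at the maximum

p = sum(ps[j] for j in S)
x = sum(ps[j] / p * xs[j] for j in S)
y = sum(ps[j] / (1 - p) * xs[j] for j in not_S)

mean = sum(pj * xj for pj, xj in zip(ps, xs))
lhs = f(mean)
rhs = p * f(x) + (1 - p) * f(y)

print(p * x + (1 - p) * y == mean)  # the two-point split recovers the mean
print(lhs < rhs)                    # strict convexity at the two points x != y
```

Exact fractions are used so the equality check is not affected by floating-point rounding.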

Up to this point, do we use the fact that $p=\sum_{j\in S}p_j$? I think you can get this just by setting $x$ and $y$ to be what they were set to be, so setting $p$ seems unnecessary to me throughout the whole exercise. Then he writes, and I quote:

"Moreover, by the plain vanilla convexity of $f$ applied separately at $x$ and $y$, we also have the inequality \begin{equation}pf(x)+(1-p)f(y)\leq p\sum_{j\in S}\frac{p_j}{p}f(x_j)+(1-p)\sum_{j\notin S} \frac{p_j}{1-p} f(x_j) =\sum_{j=1}^{n} p_jf(x_j)\end{equation}"

Where I have a problem here is that I don't see how this is the vanilla convexity definition. Does he maybe mean Jensen's inequality by vanilla definition? I would do it with Jensen's inequality: $$f\left(\sum_{j\in S}\frac{p_j}{p}x_j\right)\leq \sum_{j\in S}\frac{p_j}{p}f(x_j),\quad f\left(\sum_{j\notin S}\frac{p_j}{1-p}x_j\right)\leq \sum_{j\notin S}\frac{p_j}{1-p}f(x_j)$$
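The two group-wise inequalities can be checked on a small example. The sketch below is only an illustration under assumed data (`xs`, `ps`, and $f(t)=t^2$ are hypothetical choices, and `jensen_gap` is a helper name introduced here). It also makes visible that the normalized weights $p_j/p$ and $p_j/(1-p)$ each sum to $1$, which Jensen's inequality requires within each group.

```python
from fractions import Fraction as F

# Hypothetical data, for illustration only (not from the book).
xs = [F(1), F(2), F(4), F(4)]
ps = [F(1, 6), F(1, 3), F(1, 4), F(1, 4)]

def f(t):
    return t * t  # assumed example of a strictly convex function

def jensen_gap(weights, points):
    """Return sum w_j f(x_j) - f(sum w_j x_j) for weights summing to 1."""
    assert sum(weights) == 1, "Jensen needs weights that sum to 1"
    avg = sum(w * v for w, v in zip(weights, points))
    return sum(w * f(v) for w, v in zip(weights, points)) - f(avg)

m = max(xs)
p = sum(pj for pj, xj in zip(ps, xs) if xj != m)

# Normalized weights and points inside S (x_j != max) and outside S.
w_S = [pj / p for pj, xj in zip(ps, xs) if xj != m]
x_S = [xj for xj in xs if xj != m]
w_T = [pj / (1 - p) for pj, xj in zip(ps, xs) if xj == m]
x_T = [xj for xj in xs if xj == m]

gap_S = jensen_gap(w_S, x_S)  # nonnegative by Jensen's inequality
gap_T = jensen_gap(w_T, x_T)  # zero here: every point outside S equals the max

print(gap_S >= 0, gap_T == 0)
```

In this example the second inequality holds with equality, because every $x_j$ with $j\notin S$ equals the maximum, so $y$ is that maximum as well.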

And from what I can tell he still doesn't use $\displaystyle p=\sum_{j\in S} p_j$.

One possibility is that the steps I mentioned are fine and Steele simply chose a different route, but the more probable scenario, in my opinion, is that I missed something.