proof verification: Jensen's inequality


The following is Jensen's inequality as stated in Rudin's "Real and Complex Analysis" (Thm 3.3).

Let $\mu$ be a positive measure on a $\sigma$-algebra $\mathfrak M$ in a set $\Omega$, where $\mu(\Omega)=1$. Let $f$ be a function in $L^1 (\mu)$. Suppose $a<f(x)<b$ for all $x \in \Omega$, and $\phi$ is a convex function on $(a, b)$. (The cases $a=-\infty$ and $b=\infty$ are allowed.) Then

\begin{equation} \phi ( \int_{\Omega}fd\mu ) \le \int_{\Omega}\phi \circ f d\mu \end{equation}
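
As a quick illustration of the statement (a hypothetical example chosen here, not taken from Rudin): take $\phi(x)=x^2$ on $(-\infty, \infty)$, and let $\Omega=\{1,2\}$ with $\mu(\{1\})=\mu(\{2\})=\frac{1}{2}$, $f(1)=0$, $f(2)=2$. Then $\int_{\Omega}fd\mu = 1$ and

$$\phi\left(\int_{\Omega}fd\mu\right) = 1^2 = 1 \le 2 = \tfrac{1}{2}\cdot 0^2 + \tfrac{1}{2}\cdot 2^2 = \int_{\Omega}\phi \circ f d\mu .$$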

Although Rudin gives a wonderful proof, I would like to check whether my own proof is correct. It is explained below.

We will construct a sequence of functions $\{\phi_n\}$ such that $\phi_n \le \phi$ and $\phi_n \to \phi$ pointwise on $(a, b)$. Let $[a=x_0 , x_1 , \cdots , x_n =b]$ be a partition such that $x_{i+1}-x_i < \frac{1}{n}$. Let $\psi_n$ be the piecewise linear function which passes through $$(x_1 , \phi(x_1)), (x_2, \phi(x_2)), \cdots , (x_{n-1}, \phi(x_{n-1}))$$ and consists of line segments (i.e., if $x_{i-1}<x<x_i$, then $\psi_n(x)$ is determined by the segment joining $(x_{i-1}, \phi(x_{i-1}))$ and $(x_i, \phi(x_i))$). Also, extend the segments at the left and right ends so that $\psi_n$ is defined on all of $(a, b)$ (i.e., if $a<x<x_1$, then $\psi_n(x)$ is determined by the extended segment joining $(x_1, \phi(x_1))$ and $(x_2, \phi(x_2))$).

Intuitively, $\psi_n$ lies slightly above $\phi$, except near the ends. A convex function on an open interval is continuous, and a continuous function is uniformly continuous on a compact interval, so for any $n$ we can choose $m$ such that $|x-y|<\frac{1}{m}$ implies $|\phi(x)-\phi(y)|<\frac{1}{n}$ for $x, y \in [a+\frac{1}{n}, b-\frac{1}{n}]$. Now define $\phi_n = \psi_m - \frac{1}{n}$. Then $\phi_n$ lies slightly below $\phi$, and it is not hard to prove that $\phi_n \le \phi$ (using the convexity of $\phi$). Also, by construction, $\phi_n$ converges to $\phi$ uniformly on every compact subset of $(a, b)$. Thus $\{\phi_n\}$ has the desired properties.

We will now prove that Jensen's inequality holds with $\phi_n$ in place of $\phi$. To do this, we decompose $\phi_n$ into a sum of functions, each of which has at most one kink (a point where the slope changes) and is linear everywhere else. (Without loss of generality, we can assume that one of the linear parts is constant.) So suppose $$\phi_n(x)=\begin{cases} k & a<x\le c \\ k+m(x-c) & c<x<b \end{cases}$$ where $k, m$ are constants and $m>0$. If $a<\int_{\Omega} f d\mu \le c$, we have $$\phi_{n}\left(\int_{\Omega}fd\mu\right) = k = \int_{\Omega}kd\mu \le \int_{\Omega}\phi_n \circ fd\mu$$ since $k \le \phi_n$.
If $c<\int_{\Omega} f d \mu <b$, we have
\begin{align}
\phi_n\left(\int_{\Omega}fd\mu\right) &= k+m\left( \int_{\Omega}fd\mu -c\right) \\
&= k+m\left( \int_{f^{-1}(a, c)}fd\mu + \int_{f^{-1}[c,b)}fd\mu -c\right) \\
&\le k+m\left( \int_{f^{-1}(a, c)}cd\mu + \int_{f^{-1}[c,b)}fd\mu -c\right)\\
&= \int_{f^{-1}(a, c)}kd\mu + \int_{f^{-1}[c,b)}kd\mu + m\left(-\int_{f^{-1}[c,b)}cd\mu + \int_{f^{-1}[c,b)}fd\mu\right) \\
&= \int_{f^{-1}(a, c)}kd\mu + \int_{f^{-1}[c,b)}(k+m(f-c))d\mu \\
&=\int_\Omega \phi_n \circ f d \mu
\end{align}
Here the inequality uses $f<c$ on $f^{-1}(a, c)$ together with $m>0$, and the step after it uses $\mu(\Omega)=1$.

Now we are done. Since $\phi_n \le \phi$, we have $$\phi_n\left(\int_{\Omega}fd\mu\right) \le \int_{\Omega}\phi_n \circ f d \mu \le \int_{\Omega} \phi \circ f d \mu.$$ Since $\phi_n \to \phi$ and $a<\int_{\Omega}fd\mu <b$, letting $n \to \infty$ yields Jensen's inequality.

In the case $a=-\infty$ or $b=\infty$, we can use $-n$ or $n$ in place of $a+\frac{1}{n}$ or $b-\frac{1}{n}$ when constructing $\psi_n$.
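
The decomposition step can be sketched more explicitly as follows. The notation is an assumption introduced only for this illustration: suppose $\phi_n$ is piecewise linear with corners at $c_1<\cdots<c_r$ in $(a, b)$, with slope $s_0$ on $(a, c_1)$ and slope $s_i$ on the piece immediately to the right of $c_i$, where $s_0 \le s_1 \le \cdots \le s_r$ (the slopes of successive chords of a convex function are nondecreasing). Writing $t^+=\max(t,0)$, one possible decomposition is

$$\phi_n(x) = \underbrace{\phi_n(c_1)+s_0(x-c_1)}_{\ell(x)} + \sum_{i=1}^{r} \underbrace{(s_i-s_{i-1})(x-c_i)^+}_{h_i(x)} .$$

Each $h_i$ has the single-corner form treated above, with $k=0$, $m=s_i-s_{i-1}\ge 0$ and $c=c_i$ (terms with $m=0$ vanish), while the affine part satisfies $\ell(\int_{\Omega}fd\mu)=\int_{\Omega}\ell \circ f d\mu$ because $\mu(\Omega)=1$. Adding these up would give

\begin{align}
\phi_n\left(\int_{\Omega}fd\mu\right) &= \ell\left(\int_{\Omega}fd\mu\right) + \sum_{i=1}^{r} h_i\left(\int_{\Omega}fd\mu\right) \\
&\le \int_{\Omega}\ell \circ f d\mu + \sum_{i=1}^{r} \int_{\Omega} h_i \circ f d\mu \\
&= \int_{\Omega}\phi_n \circ f d\mu .
\end{align}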

I admit that the above proof is not fully rigorous, but I hope I have written down every idea and the main details. If you find any error or have doubts about any step, please let me know. Thank you.