Question on proof of Jensen's inequality - $E[\phi(X)] \ge \phi(E[X])$, where $\phi$ is convex

In Steven Shreve's Stochastic Calculus for Finance, on page 30, Shreve proves Jensen's inequality. However, I don't quite understand all of the steps in the proof.

(P.S. I am reading Feller alongside this book, so this is my first serious exposure to probability.)

If $\phi(x)$ is a convex function in the dummy variable $x$, then

$$\mathbb{E}(\phi(X))\ge \phi(\mathbb{E}(X))$$
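As a quick sanity check on the statement, here is a small example of my own (not from the book). I assume $X$ takes the values $-1$ and $1$ with probability $\tfrac12$ each and take $\phi(x)=x^2$, which is convex:

$$\mathbb{E}(X) = \tfrac12(-1) + \tfrac12(1) = 0, \qquad \mathbb{E}(\phi(X)) = \tfrac12(-1)^2 + \tfrac12(1)^2 = 1 \ \ge\ 0 = \phi(0) = \phi(\mathbb{E}(X)).$$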

Proof.

We first argue that a convex function is the maximum of all linear functions that lie below it. That is, for all $x \in \mathbb{R}$,

$$\phi(x) =\max\{l(x) \mid l \text{ is linear and }l(y)\le \phi(y),\ \forall y\in \mathbb{R}\} \tag{1}$$

(We first prove that $\phi(x)$ is an upper bound for every linear function lying below it. That is amply clear to me.)

Since we are considering only linear functions $l$ that lie below $\phi$, it is clear that:

$$\phi(x) \ge \max\{l(x)| l \text{ is linear and }l(y)\le \phi(y),\forall y\in \mathbb{R}\}$$

On the other hand, let $x$ be an arbitrary point in $\mathbb{R}$. Because $\phi$ is convex, there is always a linear function that lies below $\phi$ for which $\phi(x)=l(x)$ for this particular $x$. This is called the support line of $\phi$ at $x$.
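To make the support line concrete for myself, I tried an example of my own choosing (again not from the book), $\phi(y)=y^2$. If I understand correctly, the support line at a fixed point $x$ should be the tangent line there:

$$l(y) = x^2 + 2x(y-x), \qquad \phi(y) - l(y) = y^2 - 2xy + x^2 = (y-x)^2 \ge 0, \qquad l(x) = x^2 = \phi(x).$$

So in this example $l$ lies below $\phi$ everywhere and touches it at $y = x$.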

Therefore,

$$\phi(x) \le \max\{l(x)| l \text{ is linear and }l(y)\le \phi(y),\forall y\in \mathbb{R}\}$$

This establishes the equality (1).

Where does the $\le$ sign in this last inequality come from? Why would $\phi(x)$ be less than or equal to the maximum over all linear functions lying below $\phi$?