Derivation of Jensen's Inequality with Taylor Series

In the proof of Jensen's inequality in a probabilistic setting, the book gives the following demonstration:
Expand $f(x)$ in a Taylor series around $\mu =\mathbb {E}[X]$: $$f(x)=f(\mu)+f'(\mu)(x-\mu)+\frac {f''(\epsilon)(x-\mu)^2}{2}$$ for some $\epsilon$ between $x$ and $\mu$. Since $f$ is convex, $f''(\epsilon)\ge 0$, and therefore $f(X)\ge f(\mu)+f'(\mu)(X-\mu)$. Taking expectations of both sides gives $$\mathbb{E}[f(X)]\ge f(\mu).$$
The part I do not understand is how they truncate the Taylor series to only three terms and are still able to say that the truncated expression equals $f(x)$. In my head, this should only be an approximation.
I suppose one must treat $\epsilon$ as a function of $x$, but the justification of this step still eludes me; if there are terms missing in the expansion, then the inequality does not necessarily follow.
How does one justify truncating the Taylor series to only three terms while still maintaining equality?

There are 2 best solutions below

In general, if we have an analytic function $f$, we can write out the Taylor series for $f$ around a point $x$ as $$f(y)=\sum_{i=0}^\infty f^{(i)}(x)\frac{(y-x)^i}{i!}$$ and we'd have equality for all $y$ in some neighborhood of $x$. When we don't have an analytic function, and say we just have a twice differentiable function (as in your case), the series above doesn't even make sense, since we don't necessarily have $f^{(i)}(x)$ defined for $i=3,4,5,\dots$ However, it's still always true that for every $y$ in some neighborhood of $x$, we can find $z$ in between $x$ and $y$ (i.e. if $x\leq y$ then $x\leq z\leq y$, and if $x\geq y$ then $x\geq z\geq y$) such that $$f(y)=f(x)+f'(x)(y-x)+\frac{f''(z)}{2}(y-x)^2.$$ The equality is exact, but only for our choice of $y$. We'd (maybe) have a different $z$ for a different choice of $y$ near $x$. In the proof of Jensen's inequality, the fact that $z$ may vary doesn't pose a problem because we just use the fact that $f''$ is nonnegative.
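To make the dependence of $z$ on $y$ concrete, here is a minimal numerical sketch. It assumes the particular choice $f=\exp$ expanded around $x=0$ (my choice for illustration, not something from the book), where the exact equation $e^y = 1 + y + e^{z}y^2/2$ can be solved for $z$ in closed form. The function name `lagrange_point` is hypothetical.

```python
import math

def lagrange_point(y: float) -> float:
    """For f = exp expanded around x = 0, return the z that solves
    exp(y) = 1 + y + exp(z) * y**2 / 2 exactly (illustrative choice of f)."""
    if y == 0:
        raise ValueError("y must differ from the expansion point 0")
    # Remainder after the first-order terms: f(y) - f(0) - f'(0) * y
    remainder = math.exp(y) - 1.0 - y
    # f''(z) = exp(z), so exp(z) = 2 * remainder / y**2
    return math.log(2.0 * remainder / y**2)

for y in (-2.0, -0.5, 0.5, 2.0):
    z = lagrange_point(y)
    # z lies between the expansion point 0 and y, as the theorem promises
    assert min(0.0, y) <= z <= max(0.0, y)
    print(f"y = {y:+.1f}  ->  z = {z:+.4f}")
```

Each $y$ gets its own $z$, but every such $z$ still satisfies $f''(z)=e^{z}\ge 0$, which is all the Jensen argument needs.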

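Notice that the last term comes with the qualifier "for $\epsilon$ between $x$ and $\mu$." That term is the error (remainder) term, so the expression isn't an approximation. It is exact.

The error term is given by the mean value theorem, in the form of Taylor's theorem with the Lagrange remainder.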
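To spell out how the mean value theorem produces that term, here is a sketch of the standard argument. The helper function $g$ and the constant $K$ are names introduced here for illustration, and I am assuming $f$ is twice differentiable on an interval containing $x$ and $\mu$. Fix $x\neq\mu$ and choose $K$ so that $$g(t)=f(t)-f(\mu)-f'(\mu)(t-\mu)-K(t-\mu)^2$$ satisfies $g(x)=0$. Then $g(\mu)=0$ as well, so Rolle's theorem (the special case of the mean value theorem) gives a point $c$ strictly between $\mu$ and $x$ with $g'(c)=0$. Since $g'(t)=f'(t)-f'(\mu)-2K(t-\mu)$, we also have $g'(\mu)=0$, and applying Rolle's theorem to $g'$ gives $\epsilon$ strictly between $\mu$ and $c$ with $$g''(\epsilon)=f''(\epsilon)-2K=0.$$ So $K=f''(\epsilon)/2$, and $g(x)=0$ is exactly $$f(x)=f(\mu)+f'(\mu)(x-\mu)+\frac{f''(\epsilon)}{2}(x-\mu)^2.$$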