$f$ convex $\iff$ $f(y) \ge f(x) + \nabla^Tf(x)(y-x)$ and $f$ is convex $\iff$ $\nabla^2f(x)\ge 0$

2.7k Views Asked by At

I need to prove $2$ things:

1) Let $f\in C^1$. Then $f$ is convex em a convex $S$ $\iff$ for all $x,y\in S$ we have

$$f(y) \ge f(x) + \nabla^Tf(x)(y-x)$$

2) Let $f\in C^2$. Let $S\subset\mathbb{R}^n$ be a convex such that $S^0$ (actually an S with a $0$ on top of it, what is it?). Then $f$ is convex $\iff$ $\nabla^2f(x)\ge 0$ for all $x\in S$

(I've seen Prove this basic inequality $f(y)\geq f(x)+\nabla f(x)^T(y-x)$for differentiable convex functions but since these $2$ propositions are connected I think it has more sense to do a proof a little different)

For (1), I noticed that if we take the Taylor Expansion:

$$f(x+p) = f(x) + p^T\nabla^Tf(x) + \frac{1}{2}p^t\nabla^2f(x+tp)p$$

for some $t\in (0,1)$

and do $p = y-x$ we end up with

$$f(y) = f(x) + (y-x)^T\nabla^Tf(x) + \frac{1}{2}(y-x)^t\nabla^2f(x+t(y-x))(y-x)$$

If we want to prove that $f(y) \ge f(x) + \nabla^Tf(x)(y-x)$, then using the taylor expansion above, it's just a matter of proving $\frac{1}{2}(y-x)^t\nabla^2f(x+t(y-x))\ge 0$. But there are no assumptions about the second derivative of $f$.

For (2), if we use the convex condition of $1$, then using the second order taylor expansion again:

$$f(y) = f(x) + (y-x)^T\nabla^Tf(x) + \frac{1}{2}(y-x)^t\nabla^2f(x+t(y-x))(y-x)$$

Now if we suppose $\nabla^2 f(x)\ge 0$, we get that $f$ is convex because we get the inequality of the first exercise. But how do we know for sure that $\nabla^2f(x)\ge 0$ for all $x\in S$ $\implies \frac{1}{2}(y-x)^t\nabla^2f(x+t(y-x))(y-x)$?

For the converse, if $f$ is convex then

$$f(y) \ge f(x) + \nabla^Tf(x)(y-x)$$

If $\nabla^2f(x)$ wereto be $<0$, we'd have the second order term $\frac{1}{2}(y-x)^t\nabla^2f(x+t(y-x))(y-x)$ in a taylor expansion to be negative (how to prove this?), then by the taylor expansion we'd have $f(y) < f(x) + \nabla^Tf(x)(y-x)$ which is a contradiction.

1

There are 1 best solutions below

8
On BEST ANSWER

Part 1 "$\implies$"

$f$ is convex iff $$f(ty+(1-t)x)\le tf(y)+(1-t)f(x), \forall x,y, \forall t\in [0,1].$$

Thus, if $f$ is convex then $$g(t)=f(ty+(1-t)x)-tf(y)-(1-t)f(x)\le 0, \forall t\in [0,1].$$

So

$$g'(0)=\lim_{t\to 0^+}\frac{g(t)-g(0)}{t}\le 0.$$ That is

$$\nabla^T f(x)\cdot (y-x)-f(y)+f(x)\le 0$$ or, equivalently, $$f(y)\ge \nabla^T f(x)\cdot (y-x)+f(x)$$

Part 1 "$\impliedby$"

Now, consider $z=tx+(1-t)y.$ We have that

$$f(y)\ge \nabla^T f(z)\cdot (y-z)+f(z)=t\nabla^T f(z)\cdot (y-x)+f(tx+(1-t)y)$$

and

$$f(x)\ge \nabla^T f(z)\cdot (x-z)+f(z)=(1-t)\nabla^T f(z)\cdot (x-y)+f(tx+(1-t)y).$$ Multiplying the first inequality by $1-t,$ the second one by $t$ and adding them we get

$$tf(y)+(1-t)f(x)\ge f(ty+(1-t)x),$$ which shows that $f$ is convex.

Part 2 "$\implies$"

A matrix $A$ is said to be positive semidefinite if $v^TAv\ge 0, \forall v.$ In this case the notation $\nabla^2 f\ge 0$ means that $\nabla^2f$ is positive semidefinite and thus $$(y-x)^t\nabla^2f(x+t(y-x))(y-x)\ge 0.$$

Part 2 "$\impliedby$" (improved/corrected following the comments by @CalvinKhor)

Finally, assume that $\nabla^2 f\ge 0$ doesn't hold. Thus there exists a vector $v$ such that $$v^t \nabla^2 f(z) v<0.$$

Because of the continuity of the Hessian we have that $$v^t \nabla^2 f(z') v<0$$ if $z'$ satisfies $|z-z'|<\epsilon$ for some $\epsilon>0.$ Now considering $x=z,y=z+\epsilon v$ and $t\in [0,1]$. It follows from Taylor that

$$f(z+\epsilon v) = f(z) + \nabla f(z) \cdot (\epsilon v) + \epsilon^2 \frac12 v^t\nabla^2 f(z+t\epsilon v) v < f(z) + \nabla f(z) \cdot (\epsilon v)$$

So, because of Part $1$, we conclude that $f$ is not convex.