Prove $\nabla^2 f(x) \preceq L I$ for convex $f$ with Lipschitz gradient

1.2k Views Asked by At

I am given a convex and twice differentiable function $f$ whose gradient is Lipschitz with constant "$L$". I am trying to prove $$ \nabla^2 f(x) \preceq L\,I. $$ I recognize that this is pretty easy, and suspect the proof will invoke the definition of the Hessian.

From $f$ having Lipschitz gradient, we know $\|\nabla f(x) - \nabla f(y)\| \lt L\,\|x - y\|$.

And therefore $$ \lim_{x \to y} \frac{\|\nabla f(x) - \nabla f(y)\|}{\|x - y\|} \lt \frac{L\,\|x - y\|}{\|x - y\|} ~~\Longrightarrow ~~\nabla^2 f(x) \preceq LI. $$ But this isn't quite right since the Hessian is a matrix and the quantities inside the limit to the left of the implication arrow are scalars. What is the proper way to prove this?

Any help here would be helpful :-)

2

There are 2 best solutions below

2
On

@Brian-Borchers

Okay, it looks like using a first order Taylor expansion of the gradient could work. Please let me know if this is right.

Using the first order Taylor expansion around a point $x_0$ for the gradient gives us $$ \nabla f(x) = \nabla f(x_0) + (\nabla^2 f(\xi)) (x - x_0) $$ where $\xi$ is some point between $x$ and $x_0$. This implies $$ \nabla f(x) - \nabla f(x_0) = (\nabla^2 f(\xi)) (x - x_0). $$ Taking norms we have $$ \| \nabla f(x) - \nabla f(x_0) \| = \| (\nabla^2 f(\xi)) (x - x_0) \|. $$ Since $\| \nabla f(x) - \nabla f(x_0) \| \le L\,\| x - x_0 \|$ by assumption, this implies $$ \| (\nabla^2 f(\xi)) (x - x_0) \| \le L \, \|x - x_0\|. $$ Therefore $$ \frac{\| (\nabla^2 f(\xi)) (x - x_0) \|}{\|x - x_0\|} \le \frac{L \, \|x - x_0\|}{\|x - x_0\|} $$ where the vector $x - x_0$ is arbitrary. Therefore $$ \nabla^2 f(\xi) \preceq L \, I. $$

Does that do it? Thanks again.

0
On

@Brian-Borchers

Okay, here's another stab. Assume there exists $x_0$ such that $\nabla^2 f(x_0) \succ L\,I$. Then there exists a vector $v$ such that $$ \frac{\|\,(\nabla^2 f(x_0))\,v\,\|}{\|v\|} > L. $$ Let $x = x_0 + v$ and we assume $\|v\|$ is small enough such that $x \in D(x_0,\epsilon) \subseteq \text{Domain}(f)$.

By Taylor's theorem, we know$^*$ that $$ \nabla f(x) = \nabla f(x_0) + (\nabla^2 f(x_0))(x - x_0) + o(\|x - x_0\|)(x-x_0). ~~~~~~~(1) $$ Now, taking norms and taking the limit as $x \to x_0$ we have for some $x$ sufficiently close to $x_0$ $$ \lim_{x\to x_0} \Biggl\{\|\nabla f(x) - \nabla f(x_0) \| = \Bigl\| (\nabla^2 f(x_0))(x - x_0) + o(\|x - x_0\|)(x-x_0) \Bigr\| > L \, \|x - x_0\| \Biggr\} $$

This contradicts our assumption that $\|\nabla f(x) - \nabla f(x_0) \| \le L \, \|x-x_0\|$ $~~~~\square$

$^*$However, I am not sure about my claim that equation (1) is true. I am not familiar with how to handle the remainder term in a multivariate Taylor expansion.