Smoothness means that the gradient is Lipschitz continuous

3.7k Views Asked by At

Let $f:\mathbb{R}^d\rightarrow \mathbb{R}$ be a convex and differentiable function. We are saying that $f$ is smooth (with parameter $L$) if $\forall \mathbf{x}, \mathbf{y} \in \mathbb{R}^d: f(\mathbf{y}) \leq f(\mathbf{x}) + \nabla f(\mathbf{x})^{T} (\mathbf{y} - \mathbf{x}) + \frac{L}{2}\|\mathbf{x}-\mathbf{y}\|^2$

I am trying to prove the following Lemma:

Let $f:\mathbb{R}^d \rightarrow \mathbb{R}$ be convex and differentiable. The following two statements are equivalent:

(i) f is smooth with parameter $L$

(ii) $\|\nabla f(\mathbf{x}) - \nabla f(\mathbf{y})\| \leq L\|\mathbf{x}-\mathbf{y}\| \ \forall \mathbf{x}, \mathbf{y} \in \mathbb{R}^d$

I've managed to do (ii) $\rightarrow$ (i) by using the first order characterization of convexity, subtracting and adding the term $\nabla f(\mathbf{x})^T(\mathbf{y}-\mathbf{x})$ and then by using the Cauchy-Schwarz inequality.

However I am stuck in (i) $\rightarrow$ (ii). I am taking the smoothness condition one time for $\mathbf{x}, \mathbf{y}$ and one time for $\mathbf{y}, \mathbf{x}$ and I am adding the inequalities produced leading to

$(\nabla f(\mathbf{x}) - \nabla f(\mathbf{y}))^T (\mathbf{x} - \mathbf{y}) \leq L \|\mathbf{x} - \mathbf{y} \|^2 $

but this doesn't seem to lead somewhere. Any ideas on how to prove this?