Gradient equality with an extra term

47 Views Asked by At

Suppose that the gradient of a function $f: \mathbb{R}^n → \mathbb{R}$ is Lipschitz continuity; that is, there exists $L > 0$ such that $$∥∇f(x)−∇f(y)∥≤L∥x−y∥,$$ for all $x,y\in\mathbb{R}^n.$ Prove that for any $u,v∈ \mathbb{R}^n$ we have $$f(u) ≤ f(v) + ∇f(v)^⊤(u − v) + L∥u − v∥^2.$$

I know that $$f(u) ≤ f(v) + ∇f(v)^⊤(u − v)$$ is straightforward if $f$ is convex. But how can I add an extra term, i.e., $L∥u − v∥^2$ into the inequality in the case that $f$ is not necessary convex like this?

1

There are 1 best solutions below

0
On BEST ANSWER

Welcome to this community. Let's do this proof.

First of all, observe that for all $x,y\in\mathbb{R}^n$

\begin{equation*} \begin{split} [\nabla f(x)-\nabla f(y)]^\top(x-y) & := \langle \nabla f(x)-\nabla f(y), x-y \rangle \leq |\langle \nabla f(x)-\nabla f(y), x-y \rangle|^2 \\ &\leq^{(*)} ||\nabla f(x)-\nabla f(y) ||\cdot ||x-y|| \leq L ||x-y||^2 \end{split} \end{equation*} where in $(*)$ we have used the Schwarz inequality.

Now, we define the function $$g(x):=\frac{L}{2}x^\top x -f(x),$$ whose gradient can be obtained easily as $$\nabla g(x)=Lx-\nabla f(x).$$ Then, apply the previous expression we have deduced, we have \begin{equation*} \begin{split} [\nabla g(x)-\nabla g(y)]^\top(x-y) & := [Lx-\nabla f(x)-Ly+\nabla f(y)]^\top (x-y) \\ & = [\nabla f(y)-\nabla f(x)]^\top (x-y)+L (x-y)^\top (x-y) \\ & = -[\nabla f(x)-\nabla f(y)]^\top (x-y)+L (x-y)^\top (x-y) \\ & \geq -L (x-y)^\top (x-y)+L (x-y)^\top (x-y) = 0, \\ \end{split} \end{equation*} which implies that $g$ is convex by the monotone gradient condition.

Since $g$ is convex, then $$g(y) \geq g(x)+\nabla g(x)^\top (y-x)$$ is satisfied, so \begin{equation*} \begin{split} f(y) & \leq f(x)+\frac{L}{2}y^\top y -\frac{L}{2}x^\top x-[Lx-\nabla f(x)]^\top(y-x) \\ & = f(x)+ \nabla f(x)^\top (y-x)+\frac{L}{2}y^\top y+\frac{L}{2}x^\top x -Lx^\top y\\ & = f(x)+ \nabla f(x)^\top (y-x)+\frac{L}{2}\left[y^\top y+x^\top x -2x^\top y \right]\\ & = \nabla f(x)^\top (y-x)+\frac{L}{2} ||x-y||^2\\ \end{split} \end{equation*}