How to show $\frac{1}{L}\|\nabla f(y)-\nabla f(x)\|_2^2 \leq \langle \nabla f(y) - \nabla f(x) ,y-x \rangle$ for strongly convex function?

721 Views Asked by At

Suppose $f: \mathbb{R}^n \rightarrow \mathbb{R}$ is a strongly convex function, i.e.,

$$ \langle \nabla f(y) - \nabla f(x) ,y-x \rangle \geq \theta \|y-x\|_2^2 \,\,\,\,\, \forall x,y \in \mathbb{R}^n $$

or

$$ z^T(\nabla^2 f(x) -\theta I)z\geq 0 \,\,\,\,\forall z \neq 0 $$

Also, suppose $\nabla f(x)$ is a Lipschitz with Lipschitz constant $L>0$, i.e., $ \|\nabla f(y)- \nabla f(x)\|_2 \leq L \|y-x\|_2$.

Show $\frac{1}{L}\|\nabla f(y)-\nabla f(x)\|_2^2 \leq \langle \nabla f(y) - \nabla f(x) ,y-x \rangle$

2

There are 2 best solutions below

3
On BEST ANSWER

In order to get the conclusion we only need the convexity of $f(x)$ rather than the strong convexity. It is easy to show that if $\nabla f(x)$ is Lipschitz continuous with constant $L$ then we have:$$f(y)-f(x)-\langle\nabla f(x),y-x\rangle\leq\frac{L}{2}\| x-y\|^2$$ To show this inequality, we have:\begin{align} f(y)-f(x)-\langle\nabla f(x),y-x\rangle&=\int_{0}^{1}\langle \nabla f(x+t(y-x)),y-x \rangle dt-\langle\nabla f(x),y-x\rangle \\ &=\int_{0}^{1}\langle \nabla f(x+t(y-x))-\nabla f(x),y-x \rangle dt \\ &\leq\int_0^1\|\nabla f(x+t(y-x))-\nabla f(x)\|\|y-x\| dt\\ &\leq\int_0^1Lt\|y-x \|^2dt=\frac{L}{2}\|y-x\|^2 \end{align}

In order to get the result we need to consider the next function $$h(x)=f(x)-\langle \nabla f(y),x\rangle$$ It is clear that $h(x)$ is convex and also $\nabla h(x)$ is Lipschitz continuous with constant $L$, and hence $y\in\arg\min_{x}h(x)$ because $\nabla h(y)=0$. We have:$$ h(x-\frac{1}{L}\nabla h(x))\geq h(y) $$ Use the first inequality: $$ h(x-\frac{1}{L}\nabla h(x))\leq h(x)-\frac{1}{2L}\|\nabla h(x)\|^2 $$ We have: $$ h(y)\leq h(x)-\frac{1}{2L}\|\nabla h(x) \|^2 $$ $$f(y)+\langle\nabla f(y),x-y\rangle\leq f(x)-\frac{1}{2L}\| \nabla f(x)-\nabla f(y)\|^2 $$ $$f(y)-f(x)+\langle\nabla f(y),x-y\rangle\leq -\frac{1}{2L}\| \nabla f(x)-\nabla f(y)\|^2$$ Exchange $x,y$: $$f(x)-f(y)+\langle\nabla f(x),y-x\rangle\leq -\frac{1}{2L}\| \nabla f(x)-\nabla f(y)\|^2$$ Add them together then you can get the conclusion.

1
On

By definition of strong convexity we have the following:

\begin{equation} f(x) \geq f(y) + <\nabla f(y), x-y> + \frac{\sigma}{2} \left\lVert x-y\right\rVert^2 \end{equation}

Similarly we have:

\begin{equation} f(y) \geq f(x) + <\nabla f(x), y-x> + \frac{\sigma}{2} \left\lVert x-y\right\rVert^2 \end{equation}

If you sum up these two inequalities and organize them you will end up with the following:

\begin{equation} <\nabla f(y) - \nabla f(x), y-x> \geq \sigma.\left\lVert x-y\right\rVert^2 \end{equation}

Which is what you have. We call this function $\sigma$-strongly convex. To give an example, "negative entropy" is a 1-strongly convex (also known as 1-SC) function wrt L1-norm which is used as regularization term for making a non-smooth function smooth in L1-setting (in learning with experts setup)--Please note that in L1-setting everything is a simplex.

Additionally, you can use "Hölder's inequality" to prove the general case shown below (not just norm 2):

\begin{equation} \left\lVert \nabla f(y) - \nabla f(x)\right\rVert_{\infty} \left\lVert x-y \right\rVert_{1} \geq <\nabla f(x) - \nabla f(y), x-y> \geq \sigma.\left\lVert x-y\right\rVert_{1}^2 \end{equation}

enter image description here