Intuition behind why smooth convex functions enjoy faster optimization convergence rates


I am following the derivation in Convex Optimization: Algorithms and Complexity.

It is reported that projected subgradient descent on a convex, non-smooth function satisfies
$$f\left(\frac{1}{t}\sum_{s=1}^{t} x_s\right) - f(x^\star) \leq \frac{RL}{\sqrt{t}},$$

where $L$ is the Lipschitz constant of $f$ and $R = \|x_1 - x^\star\|$. If instead $f$ is $\beta$-smooth, then gradient descent satisfies:

$$f(x_t) - f(x^\star) \leq \frac{2\beta \|x_1 - x^\star\|^2}{t-1}.$$

The convergence rate is faster: $O(1/t)$ instead of $O(1/\sqrt{t})$. One important ingredient behind the speedup is a bound on the one-step improvement:

$$f\left(x - \tfrac{1}{\beta}\nabla f(x)\right) - f(x) \leq -\frac{1}{2\beta}\|\nabla f(x)\|^2.$$
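For reference, this one-step improvement is a standard consequence of the quadratic upper bound that $\beta$-smoothness provides (derivation filled in here for completeness):
$$f(y) \leq f(x) + \nabla f(x)^\top (y - x) + \frac{\beta}{2}\|y - x\|^2.$$
Setting $y = x^+ = x - \frac{1}{\beta}\nabla f(x)$ gives
$$f(x^+) \leq f(x) - \frac{1}{\beta}\|\nabla f(x)\|^2 + \frac{1}{2\beta}\|\nabla f(x)\|^2 = f(x) - \frac{1}{2\beta}\|\nabla f(x)\|^2.$$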

But is there an intuitive explanation for the speedup enjoyed by smooth convex functions?

1 Answer

Consider the convex functions $f(x)=x^2$ and $g(x)=|x|$. $f$ is a smooth function while $g$ is merely a Lipschitz function. An undesirable property of $g$ is that its (sub)gradient changes abruptly at the minimum. Therefore, when you design a gradient descent algorithm for this problem, you have to use a very small learning rate; otherwise, GD will oscillate around the minimum.
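A minimal numerical sketch of this oscillation (an illustration of the answer's claim, not code from the original post):

```python
import numpy as np

def descend(grad, x0, eta, steps):
    """Fixed-step (sub)gradient descent; returns all iterates."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - eta * grad(xs[-1]))
    return xs

eta, steps = 0.3, 50

# g(x) = |x|: the subgradient sign(x) never shrinks, so a fixed step
# overshoots the minimum and the iterates bounce around 0 forever.
xs_abs = descend(np.sign, 1.0, eta, steps)

# f(x) = x^2: the gradient 2x shrinks near 0, so the same fixed step
# converges geometrically (x_{k+1} = (1 - 2*eta) * x_k).
xs_sq = descend(lambda x: 2.0 * x, 1.0, eta, steps)

print(abs(xs_abs[-1]))  # stuck at an eta-sized distance from the minimum
print(abs(xs_sq[-1]))   # essentially zero
```

Shrinking `eta` reduces the oscillation on $|x|$ but also slows the approach, which is exactly the trade-off forcing the $O(1/\sqrt{t})$ rate in the non-smooth case.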

However, a desirable property of smooth functions is that their gradients become small near the minimum. The reason behind this claim is the definition of smoothness, together with $\nabla f(x^\star)=0$: $$\| \nabla f(x)\|=\| \nabla f(x)-\nabla f(x^\star)\|\leq \beta \|x-x^\star\|.$$ This property lets us choose larger step sizes, and hence obtain faster convergence.
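To see this numerically, here is a sketch on an assumed quadratic example (my own illustration, not part of the answer): gradient descent with the untuned constant step $1/\beta$ converges, and the suboptimality stays below the standard $2\beta\|x_1-x^\star\|^2/(t-1)$ guarantee for $\beta$-smooth convex functions.

```python
import numpy as np

# Smooth convex quadratic f(x) = 0.5 x^T A x, minimizer x* = 0.
A = np.diag([1.0, 4.0])
beta = 4.0                         # smoothness constant = largest eigenvalue of A
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

x = np.array([3.0, 2.0])           # x_1
R2 = x @ x                         # ||x_1 - x*||^2
fvals, gnorms = [], []
for _ in range(50):
    fvals.append(f(x))             # records f(x_1), f(x_2), ...
    gnorms.append(np.linalg.norm(grad(x)))
    x = x - grad(x) / beta         # constant step 1/beta, no tuning needed

# The gradient norm shrinks as the iterates approach x*, which is what
# permits the large fixed step; f(x_t) - f(x*) obeys 2*beta*R2/(t-1).
```

Running the same fixed-step loop on a non-smooth objective like $\|x\|_1$ would stall, as in the previous example.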