What is the motivation behind strong convexity

1.6k Views Asked by At

Definition : A function is said to be $\beta$-strongly convex if,

$f(\theta w + (1-\theta) w') \le \theta f(w) + (1-\theta) f(w') - \frac{\beta}{2}\theta(1-\theta)(w-w')^2$

What is the motivation behind the $\frac{\beta}{2}\theta(1-\theta)(w-w')^2$ term?

2

There are 2 best solutions below

3
On BEST ANSWER

With this definition, $f(w)$ is $\beta$-strongly convex if and only if $g(w)=f(w) - \frac{\beta}{2}\|w\|^2$ is convex. That is, if you expand out the terms you'll see the following expressions are equivalent: $$ \begin{aligned} f(\theta w + (1-\theta) w') &\le \theta f(w) + (1-\theta) f(w') - \frac{\beta}{2}\theta(1-\theta)\|w-w'\|^2 \\ \Leftrightarrow g(\theta w + (1-\theta) w') &\le \theta g(w) + (1-\theta) g(w') \end{aligned} $$ This means $f$ can be written as the sum of convex function $g$ and a quadratic term: $f(w)=g(w)+ \frac{\beta}{2}\|w\|^2$.

Also, a strongly convex function has a unique minimizer. Suppose $w$ and $w'$ both minimize $f$ and $w\ne w'$, then the strong convexity condition implies $f((w+w')/2) < (f(w)+f(w'))/2$, a contradiction. So, in fact we can express $f(w)=h(w)+ \frac{\beta}{2}\|w - w^*\|^2 + f(w^*)$, where $h$ is a nonnegative convex function, and $w^*$ is the unique minimizer of $f$.

0
On

First, note that any strongly convex function with $\beta>0$ is convex. You can note this by considering $\theta\in\left(0,1\right)$ and $w\neq w^{\prime}$to get that $$ \frac{\beta}{2}\theta\left(1-\theta\right)\left(w-w^{\prime}\right)>0 $$ and hence \begin{align*} f\left(\theta w+\left(1-\theta\right)w^{\prime}\right) & \leq\theta f\left(w\right)+\left(1-\theta\right)f\left(w^{\prime}\right)-\frac{\beta}{2}\theta\left(1-\theta\right)\left(w-w^{\prime}\right)^{2}\\ & <\theta f\left(w\right)+\left(1-\theta\right)f\left(w^{\prime}\right)-0\\ & =\theta f\left(w\right)+\left(1-\theta\right)f\left(w^{\prime}\right). \end{align*} However, the converse is not necessarily true. Strong convexity is thus a stronger (hence the name) requirement than strict convexity. Whenever you restrict the space of objects you are looking at, you make your life easier, but you will not be able to say something as general if you had not restricted your space. From Wikipedia, ``strongly convex functions are in general easier to work with than convex or strictly convex functions, since they are a smaller class.''

Here, strong convexity is trying to make your life easier by telling you just how strictly convex a function is. The larger the $\beta$, the more ``strictly'' convex your function is.