How to show two different definitions of $\alpha$-strongly convex are equivalent?

809 Views Asked by At

In my optimization lecture notes I have the following definition:

Definition: Let $f: \mathbb{R}^n \rightarrow \mathbb{R}$ be differentiable. The $f$ is strongly convex it $\exists$ a positive constant $\alpha > 0$ such that $$ \langle \nabla f(y) - \nabla f(x),y-x \rangle \geq \alpha||y-x||^2 \qquad,\forall x,y \in \mathbb{R}^n \tag{1} $$ However, in the Online Linear Optimization via Smoothing on page 2, it says the function is $\beta$-strongly convex if

$$ f(y) -f(x) - \langle \nabla f(x),y-x \rangle \geq \frac{\beta}{2}||y-x||^2 \qquad \forall x,y \in \mathbb{R}^n \tag{2} $$

How can we show that (1) is equivalent to (2) with appropriate choice of $\beta$?

2

There are 2 best solutions below

1
On BEST ANSWER

Let’s take $(1)$ and write $g_1=f-\frac{\alpha}{2}\|\cdot\|_2^2$. Then $(1)$ is equivalent to $\langle \nabla g_1(y)-\nabla g_1(x),\,y-x \rangle \geq 0$, ie $g_1$ convex.

Do the same for $(2)$ with $g_2= f-\frac{\beta}{2}\|\cdot\|_2^2$.

0
On

I will post the following derivation which is expressed in Mindlak's answer just to have better understanding:

From (1) we have:

$$ \langle \nabla f(y) -\nabla f(x),y-x \rangle \geq \alpha||y-x||^2= \alpha \langle y - x,y-x \rangle\,\,\,\,\,,\,\,\,\forall x,y \in \mathbb{R}^n $$ So, $$ \langle \nabla f(y) - \alpha y -(\nabla f(x)-\alpha x),y-x \rangle \geq 0\,\,\,\,\,,\,\,\,\forall x,y \in \mathbb{R}^n \tag{3} $$ (3) means $g_1(x)=f(x)-\frac{\alpha}{2} \|x\|_2^2$ is convex, because it is the monotonicity property of convex function. From $$ g_1(s) \geq g_1(t) + \langle \nabla g_1(t),s-t \rangle \,\,\,\,\,,\,\,\,\forall s,t \in \mathbb{R}^n $$ we have

$$ f(y)-\frac{\alpha}{2} \|y\|_2^2 \geq f(x)-\frac{\alpha}{2} \|x\|_2^2 + \langle \nabla f(x)-\alpha x,y-x \rangle \,\,\,\,\,,\,\,\,\forall y,x \in \mathbb{R}^n $$ Yields

$$ f(y)-f(x)+ \langle \nabla f(x),y-x \rangle \geq -\frac{\alpha}{2} \|x\|_2^2 +\frac{\alpha}{2} \|y\|_2^2 +\langle -\alpha x,y-x \rangle =\frac{\alpha}{2} \|y-x\|_2^2 \\ \forall \,\,\,\,\,y,x \in \mathbb{R}^n $$ and $\alpha = \beta$