A reference request for a convergence theorem regarding gradient descent.


Polyak, in his book Introduction to Optimization (freely downloadable from his personal web page here; 1987 edition), states the following theorem on page 25:

Let $f(x)$ be twice differentiable and let $$ \ell I \le \nabla^2 f(x) \le L I, \quad 0 < \ell \le L, $$ for all $x$. Then for all $0 < \alpha < 2/L$ $$ \| x_k - x^* \| \le \| x_0 - x^* \| \, q^k , \quad q := \max \{ |1- \alpha \ell |, |1- \alpha L | \} < 1, $$

where $x^*$ is the unique minimum of the function $f(x)$ and $x_k$ is the $k$-th step of the gradient descent scheme

$$ x_{k+1} = x_k - \alpha \nabla f(x_k) $$ starting at $x_0$.
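As a quick numerical sanity check (my own illustration, not from Polyak's book), the geometric bound can be verified on a strongly convex quadratic $f(x) = \frac{1}{2} x^\top A x$, whose Hessian is $A$; the matrix and step size below are arbitrary choices:

```python
import numpy as np

# Quadratic f(x) = 0.5 * x^T A x with Hessian A; its eigenvalues give ell and L.
A = np.diag([1.0, 4.0])      # ell = 1, L = 4
ell, L = 1.0, 4.0
alpha = 0.4                  # any 0 < alpha < 2/L = 0.5 works
q = max(abs(1 - alpha * ell), abs(1 - alpha * L))  # contraction factor, here 0.6

x_star = np.zeros(2)         # unique minimizer of this quadratic
x = np.array([3.0, -2.0])    # x_0
r0 = np.linalg.norm(x - x_star)

for k in range(1, 51):
    x = x - alpha * A @ x    # gradient step: grad f(x) = A x
    # Polyak's bound: ||x_k - x*|| <= ||x_0 - x*|| q^k (small slack for rounding)
    assert np.linalg.norm(x - x_star) <= r0 * q**k * (1 + 1e-9)
```

For this diagonal quadratic the bound is in fact tight, since each coordinate is scaled by exactly $1 - \alpha \lambda_i$ per step.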

I am looking for a result of this kind (i.e., where convergence of the iterates $x_k \rightarrow x^*$ is proven, rather than only $f(x_k) \rightarrow f(x^*)$) in the case where the objective function $f(x)$ is not strongly convex but merely strictly convex or convex, i.e., when

$$ 0 \le \nabla^2 f(x) \le L I, \quad L > 0. $$
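For intuition on why the question is delicate (again my own illustration): take $f(x) = x^4$ on $\mathbb{R}$, which is strictly convex with minimizer $x^* = 0$ but has $f''(0) = 0$, so no $\ell > 0$ exists. Gradient descent still converges to $x^*$, but the error decays only like $k^{-1/2}$, not geometrically:

```python
# f(x) = x**4: strictly convex, minimizer x* = 0, but f''(0) = 0,
# so the geometric bound from Polyak's theorem does not apply.
def grad(x):
    return 4.0 * x**3

x = 1.0
alpha = 1.0 / 12.0           # safe step: f''(x) = 12 x**2 <= 12 on [-1, 1]
errors = []
for k in range(10000):
    x = x - alpha * grad(x)  # here: x <- x - x**3 / 3
    errors.append(abs(x))

# The error shrinks, but only sublinearly (roughly like sqrt(1.5 / k)),
# far slower than any geometric rate q**k with q < 1.
```

After 10,000 iterations the error is still on the order of $10^{-2}$, whereas any $q$-linear rate would have driven it far below machine precision.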