Convergence Speed for Optimization Methods on Non-Lipschitz and Strictly Monotone Functions


I am studying convergence analysis for some optimization techniques, so this may be a naive question. In derivations of convergence rates for gradient descent (and Newton's method), one usually assumes a Lipschitz condition on the first derivative (or second derivative, respectively). My question is the following: consider a strictly decreasing function that does not satisfy these Lipschitz conditions; is there a convergence analysis for gradient descent (or other techniques) on such functions? A simple example in $N$ variables is $$ \sum_{i=1}^N \frac{1}{x_i}, \qquad x_i \in (0,1], $$ which we can minimize simply by setting all $x_i = 1$. A similar function is $$ \sum_{i=1}^N e^{x_i}, \qquad x_i \in [0, \infty). $$ Any clarification is appreciated.
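To make the difficulty concrete, here is a minimal sketch (not from the question itself) of projected gradient descent on the first example. The lower cutoff `eps` is a hypothetical choice: on $(0,1]$ the gradient $-1/x_i^2$ is unbounded near $0$, which is exactly where the Lipschitz assumption fails, so in practice one projects onto a box $[\varepsilon, 1]$ bounded away from zero.

```python
import numpy as np

# Projected gradient descent on f(x) = sum_i 1/x_i over the box [eps, 1].
# eps > 0 is a hypothetical cutoff: near x_i = 0 the gradient -1/x_i^2
# blows up, so the gradient is not Lipschitz on (0, 1] itself.
def projected_gd(x0, eta=0.01, steps=2000, eps=1e-2):
    x = x0.copy()
    for _ in range(steps):
        grad = -1.0 / x**2                      # gradient of sum_i 1/x_i
        x = np.clip(x - eta * grad, eps, 1.0)   # project back onto the box
    return x

x_star = projected_gd(np.array([0.5, 0.2, 0.9]))
# All coordinates are driven to the minimizer x_i = 1.
```

On the restricted box the gradient is Lipschitz again (with constant $2/\varepsilon^3$), so the standard analysis applies there; the point of the question is what happens without such a restriction.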


2 Answers


Perhaps what you're looking for is the subgradient descent algorithm, which works without assuming that the function $f$ being optimized is even differentiable; one only needs continuity, and a Lipschitz condition on $f$ itself, i.e. $$ |f(x)-f(y)|\leq L\|x-y\|.$$ Of course, we also need access to the subgradient information, so we assume we have a subgradient oracle for $f$, meaning that for each point $x$ we can get a vector $g(x)$ such that $$f(y)\geq f(x) + \langle g(x),y-x\rangle$$ for all $y$.

This is sufficient to show a convergence rate for the subgradient method: the function-value error of the average iterate is $O(1/\sqrt{t})$ after $t$ iterations. You might want to take a look at the analysis in section 1.5 here.
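A minimal sketch of the subgradient method described above, on the toy objective $f(x) = \|x\|_1$ (which is Lipschitz but not differentiable at $0$); the objective and step-size schedule are my choices for illustration, not from the answer:

```python
import numpy as np

# Subgradient method for f(x) = ||x||_1, which is Lipschitz but not
# differentiable at 0. A valid subgradient oracle is g(x) = sign(x)
# (any value in [-1, 1] works at coordinates where x_i = 0).
def subgradient_method(x0, T):
    x = x0.copy()
    running_sum = np.zeros_like(x0)
    for t in range(1, T + 1):
        g = np.sign(x)            # subgradient oracle for ||.||_1
        eta = 1.0 / np.sqrt(t)    # classic step size giving O(1/sqrt(t))
        x = x - eta * g
        running_sum += x
    return running_sum / T        # return the averaged iterate

x_bar = subgradient_method(np.array([3.0, -2.0, 1.5]), 10_000)
# f(x_bar) = ||x_bar||_1 is close to the minimum value 0.
```

Note that the guarantee is on the *averaged* iterate: the raw iterates oscillate around the minimizer because the subgradient does not vanish there, which is why averaging (or taking the best iterate so far) appears in the standard analysis.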


I believe that self-concordance can be used as an alternative to Lipschitz conditions. Though, to be sure, someone may want to check this assertion a little more carefully. El Ghaoui has some lecture notes that discuss the difference between the traditional convergence analysis and the one based on self-concordance here. Boyd and Vandenberghe also discuss self-concordance in their book Convex Optimization starting on page 496, and their book can be downloaded for free here. Finally, in Nesterov's book, Introductory Lectures on Convex Optimization, chapter 4 is dedicated to understanding the motivations behind the traditional proof for Newton's method, including the Lipschitz conditions, and then how to understand things in terms of self-concordance.
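For reference, the standard definition used in those sources: a convex function $f:\mathbb{R}\to\mathbb{R}$ is self-concordant if its third derivative is controlled by its second,

$$ |f'''(x)| \le 2\, f''(x)^{3/2}, $$

and a function on $\mathbb{R}^n$ is self-concordant if its restriction to every line is. The canonical example is $f(x) = -\log x$ on $(0,\infty)$, which satisfies the inequality with equality ($f'' = 1/x^2$, $f''' = -2/x^3$). Note that this is an affine-invariant condition, unlike a Lipschitz bound on the Hessian, which is why Newton's method admits a cleaner analysis under it; whether the specific functions in the question are self-concordant would need to be checked separately.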