Gradient descent on a convex function without a minimizer


From what I've seen, most proofs of convergence for gradient descent on convex functions assume that a minimizer exists, i.e. for a convex $f: \mathbb{R}^d \rightarrow \mathbb{R}$ with $\inf_{x \in \mathbb{R}^d} f(x) > -\infty,$ there exists an $x_* \in \mathbb{R}^d$ such that $f(x_*) = \inf_{x \in \mathbb{R}^d} f(x).$ However, this assumption does not hold for, e.g., the logistic loss $f(x) = \log(1 + \exp(-x))$ used for linear classification: its infimum is $0$, but it is attained by no finite $x$. How would I derive the convergence rate of gradient descent for a convex function without a minimizer? Let's suppose that the objective $f$ has Lipschitz-continuous gradient with constant $L$.
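To make the situation concrete, here is a minimal numeric sketch (my own illustration, not from any of the books below) running gradient descent on the logistic loss with the standard step size $1/L$. Since $f''(x) = e^{-x}/(1+e^{-x})^2 \le 1/4$, we can take $L = 1/4$. The iterates diverge to $+\infty$ while $f(x_k)$ still decreases toward the unattained infimum $0$:

```python
import math

def f(x):
    # logistic loss: convex, inf over R is 0, but no minimizer exists
    return math.log(1.0 + math.exp(-x))

def grad_f(x):
    # f'(x) = -exp(-x) / (1 + exp(-x)) = -1 / (1 + exp(x))
    return -1.0 / (1.0 + math.exp(x))

L = 0.25        # smoothness constant: sup |f''| = 1/4 (attained at x = 0)
eta = 1.0 / L   # standard step size 1/L for L-smooth objectives

x = 0.0
for k in range(1000):
    x -= eta * grad_f(x)

# x grows without bound while f(x) approaches inf f = 0
print(f"x = {x:.3f}, f(x) = {f(x):.3e}")
```

So the function values still converge to the infimum even though $\|x_k - x_*\|$ is meaningless here; the question is what rate one can prove for $f(x_k) - \inf f$ in this setting.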

I looked through Nesterov's book and Boyd's book on convex optimization, and it seems that neither of them points out this assumption or considers the case where it doesn't hold.