From what I've seen, most proofs of convergence for gradient descent on convex functions assume that there exists at least one minimizer, i.e. for a convex $f: \mathbb{R}^d \rightarrow \mathbb{R}$ with $\inf_{x \in \mathbb{R}^d} f(x) > -\infty,$ there exists an $x_* \in \mathbb{R}^d$ such that $f(x_*) = \inf_{x \in \mathbb{R}^d} f(x).$ However, this assumption does not hold for, e.g., the logistic loss $f(x) = \log(1 + \exp(-x))$ used in linear classification: its infimum is $0$, but it is attained by no finite $x$. How would I derive the convergence rate of gradient descent for a convex function without a minimizer? Let's suppose that the objective $f$ has Lipschitz gradient with constant $L$.
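To make the question concrete, here is a small numerical sketch (my own illustration, not from any textbook) of gradient descent on the one-dimensional logistic loss with the standard step size $1/L$, where $L = 1/4$ bounds the second derivative. The iterates diverge to $+\infty$ while the function values approach the infimum $0$, so any rate stated in terms of $\|x_0 - x_*\|$ is vacuous here:

```python
import math

# Gradient descent on f(x) = log(1 + exp(-x)): convex, inf f = 0,
# but no minimizer exists -- iterates must escape to +infinity.
def f(x):
    return math.log1p(math.exp(-x))

def grad(x):
    # f'(x) = -exp(-x) / (1 + exp(-x)) = -1 / (1 + exp(x))
    return -1.0 / (1.0 + math.exp(x))

L = 0.25       # f'' (x) = exp(x) / (1 + exp(x))^2 <= 1/4, so grad is (1/4)-Lipschitz
x = 0.0        # initial point
for k in range(10000):
    x -= (1.0 / L) * grad(x)   # standard step size 1/L

# x grows without bound while f(x) -> inf f = 0
print(x, f(x))
```

The step $x_{k+1} = x_k + 4/(1+e^{x_k}) \approx x_k + 4e^{-x_k}$ shrinks as $x_k$ grows, so the iterates increase roughly like $\log k$ and $f(x_k)$ decays roughly like $1/k$, even though $\|x_k - x_*\|$ is undefined.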
I looked through Nesterov's book and Boyd's book on convex optimization, and it seems that neither of them points out this assumption or considers the case where it doesn't hold.