Why does the gradient descent backtracking line search terminate when $t \ge \beta / M$?


From Convex Optimization by Boyd and Vandenberghe:

Let $\tilde f(t) = f(x - t\nabla f(x))$ for $x \in S = \{x : f(x) \le f(x^{(0)})\}$.

Let $\alpha \in (0, 0.5), \beta \in (0,1)$ be the parameters of the backtracking line search algorithm.

Let $M$ be the number such that $\nabla^2f(x) \preceq MI$.

Then the backtracking exit condition $\tilde f(t) \le f(x) - \alpha t ||\nabla f(x)||_2^2$ is satisfied whenever $0 \le t \le 1/M.$

Proof:

If $0 \le t \le 1/M$, then $-t + Mt^2/2 \le -t/2$. Since $\nabla^2 f(x) \preceq MI$ on $S$, the quadratic upper bound gives

$$\tilde f(t) \le f(x) - t||\nabla f(x)||_2^2 + \frac{Mt^2}{2}||\nabla f(x)||_2^2 \le f(x) - \frac{t}{2}||\nabla f(x)||_2^2 \le f(x) - \alpha t||\nabla f(x)||_2^2,$$

where the last inequality uses $\alpha \le 1/2$.

Then the backtracking line search terminates either with $t=1$ or with a value $t \ge \beta / M$.


Why does the gradient descent backtracking line search terminate when $t \ge \beta / M$?

Doesn't this just show that it terminates when $t \le 1/M$?


There are 2 best solutions below


The algorithm is initialized with $t_0=1$. Either it terminates immediately, or $t$ is reduced by a factor of $\beta$, that is, $t_i = \beta t_{i-1}$. Let $t_T$ denote the value of $t$ at termination, i.e. the algorithm terminates at step $T$.

We claim that $t_T \ge \frac{\beta}{M}$.

Suppose not, so that $t_T < \frac{\beta}{M}$. Since the search did not terminate at step $T-1$, the exit condition must have failed there, and by the result above that is only possible if $t_{T-1} > \frac1M$.

Since $\beta \in (0,1)$, multiplying by $\beta$ gives $$t_T = \beta t_{T-1} > \frac{\beta}{M},$$

which is a contradiction.

Hence we must have $t_T \ge \frac{\beta}{M}$.


The argument of Siong Thye Goh can be simplified. Denote by $t_T$ the step size at termination (the backtracking line search always terminates). Since the exit condition is satisfied whenever $0\le t\le 1/M$, the fact that the algorithm did not stop at step $T-1$ forces $t_{T-1}>1/M$. Hence $\beta t_{T-1}>\beta/M$, i.e. $t_T>\beta/M$, because $\beta\in(0,1)$; in particular $t_T\ge\beta/M$.

So there is no need to use proof by contradiction.
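As a numerical sanity check, here is a small sketch (not from the original post; the quadratic objective and the parameter values are my own choices). It runs the backtracking search on a function with a known $M$ and confirms both the lemma (the exit condition already holds at $t = 1/M$) and the termination bound $t \ge \beta/M$:

```python
import numpy as np

# Sketch: backtracking line search for gradient descent on the quadratic
# f(x) = 0.5 * x^T A x, where nabla^2 f = A, so M may be taken as the
# largest eigenvalue of A.

def backtracking_step(f, grad, x, alpha=0.25, beta=0.5):
    """Return the step size t accepted by the backtracking exit condition."""
    g = grad(x)
    t = 1.0
    # Exit condition from Boyd & Vandenberghe:
    # f(x - t*g) <= f(x) - alpha * t * ||g||^2
    while f(x - t * g) > f(x) - alpha * t * g.dot(g):
        t *= beta
    return t

A = np.diag([1.0, 10.0, 100.0])   # nabla^2 f = A, so M = 100
f = lambda x: 0.5 * x.dot(A @ x)
grad = lambda x: A @ x

x0 = np.array([1.0, 1.0, 1.0])
alpha, beta, M = 0.25, 0.5, 100.0

t = backtracking_step(f, grad, x0, alpha, beta)
print(t)                          # accepted step size

# The lemma: the exit condition already holds at t = 1/M ...
g = grad(x0)
assert f(x0 - (1 / M) * g) <= f(x0) - alpha * (1 / M) * g.dot(g)

# ... so the search can never shrink t below beta/M
# (unless it exits immediately with t = 1).
assert t == 1.0 or t >= beta / M
```

The accepted $t$ is a power of $\beta$; since the condition fails at $t_{T-1} > 1/M$ and passes at $t_T = \beta t_{T-1}$, the final step size lands in $(\beta/M,\, 1/M\,]$ whenever the search does not exit at $t=1$.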