I understand the gradient descent algorithm, but I'm having trouble seeing how it relates to line search. Is gradient descent a type of line search?
What is the difference between line search and gradient descent?
Asked by Bumbble Comm
The following figure shows the hierarchy of some of the line search optimization methods for quadratic forms. The methods are categorized by how they choose the descent direction $p_k$ and the step length $\alpha_k$. Recall that
\begin{align} \phi(x) &= \frac{1}{2}x^{\top}A \, x - b^\top x + c,\\ x_k &= x_{k-1} + \alpha_{k} p_k, \\[0.5em] r_{k-1} &= -\nabla\phi(x_{k-1}) = b - Ax_{k-1}, \end{align}
where $A$ is required to be positive definite. Now, suppose that we are at the $k$th iteration of the algorithm with $r_{k-1}\ne 0$ (otherwise the solution has been found) and look at the following diagram. You should also be familiar with the following results.
Lemma. In exact line search, the step length is defined as $\alpha_{k}:= \arg\underset{\alpha \in \mathbb{R}}{\min} \phi(x_{k-1}+\alpha p_k)$. This can be further simplified to $\alpha_{k}=p_k^\top r_{k-1} /p_k^\top A\, p_k$.
Theorem. In exact line search, if $p_k^\top r_{k-1} \ne 0$ then it is guaranteed that $\phi(x_{k}) < \phi(x_{k-1})$. More specifically, we have $$\phi(x_{k}) = \phi(x_{k-1}) -\frac{1}{2}\frac{(p_k^\top r_{k-1})^2}{p_k^\top A \, p_k}.$$
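To make the Lemma concrete, here is a minimal sketch of steepest descent with exact line search on a quadratic, using hypothetical example values for $A$ and $b$ (any symmetric positive definite $A$ would do). The direction is $p_k = r_{k-1}$ and the step is the closed-form $\alpha_k$ from the Lemma:

```python
import numpy as np

# Hypothetical symmetric positive definite A and vector b (c = 0 for simplicity)
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])

def phi(x):
    return 0.5 * x @ A @ x - b @ x

x = np.zeros(2)
for _ in range(25):
    r = b - A @ x                     # residual r_{k-1} = -grad phi(x_{k-1})
    if np.allclose(r, 0):
        break                         # r = 0 means the solution is found
    p = r                             # steepest-descent direction p_k = r_{k-1}
    alpha = (p @ r) / (p @ A @ p)     # exact step length from the Lemma
    x = x + alpha * p                 # x_k = x_{k-1} + alpha_k p_k

# x converges to the minimizer of phi, i.e. the solution of A x = b
```

Each iteration strictly decreases $\phi$, by the Theorem, and the iterates converge to the solution of $Ax = b$.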

Gradient descent employs line search to determine the step length.
An iterative algorithm for solving $\min_x f(x)$ that is currently at the point $x^k$ produces a search direction $\Delta x^k$; in gradient descent, that direction is the negative gradient $-\nabla f(x^k)$. The next iterate is given by $x^{k+1} = x^k + \alpha \Delta x^k$. Line search is about finding a good value for $\alpha$. Exact line search solves $\min_{\alpha} f(x^k + \alpha \Delta x^k)$. Since this is a univariate problem, a typical solution method (for convex problems) is bisection search.
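As a sketch of that last point: for a convex differentiable $f$, the univariate function $g(\alpha) = f(x + \alpha\,\Delta x)$ has $g'(\alpha) < 0$ to the left of its minimizer and $g'(\alpha) > 0$ to the right, so bisection on the sign of $g'$ finds the exact step. The function $f$ and the starting point below are hypothetical examples chosen only for illustration:

```python
import numpy as np

# Hypothetical convex objective: log-sum-exp plus a quadratic term
def f(x):
    return np.log(np.exp(x[0]) + np.exp(x[1])) + x @ x

def grad(x):
    e = np.exp(x)
    return e / e.sum() + 2 * x

x = np.array([1.0, -2.0])
dx = -grad(x)                      # gradient-descent search direction

def gprime(alpha):                 # g'(alpha) = grad f(x + alpha dx) . dx
    return grad(x + alpha * dx) @ dx

lo, hi = 0.0, 1.0
while gprime(hi) < 0:              # expand until the minimizer is bracketed
    hi *= 2.0
for _ in range(60):                # bisect on the sign of g'
    mid = 0.5 * (lo + hi)
    if gprime(mid) < 0:
        lo = mid
    else:
        hi = mid
alpha = 0.5 * (lo + hi)            # near-exact step length
```

Note $g'(0) = -\|\nabla f(x)\|^2 < 0$, so the bracketing loop always starts on the decreasing side of the minimizer.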