What is the difference between Gradient Descent method and Steepest Descent methods?
In this book, they have come under different sections:
http://stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf
According to page 480, Gradient Descent is:
$$\Delta x=-\nabla f(x)$$
While page 490, for Steepest descent says:
$$\Delta x_{sd}=||\nabla f(x)||_*\Delta x_{nsd}$$ $$\Delta x_{nsd} = \text{argmin}\{\nabla f(x)^Tv~|~~~ ||v||\leq 1\}$$
I cannot understand their difference. How they are mathematically and geometrically different?
I am reading this book too, this is also a problem for me for a long time. The direction of gradient descent method is negative gradient. However the direction of steepest descent method is the direction such that
$Δx_{\text{nsd}}=\text{argmin}\{∇f(x)^Tv \quad| \quad ||v||≤1\}$
which is negative gradient only if the norm is euclidean. If the norm is other quadratic or l1norm, the result are not negative gradient.
The direction is -inv(P)*∇f(x), if the norm is quadratic norm.