Speeding up the convergence of the Steepest Descent Method


How would one speed up the convergence of the method of steepest descent when the minimum lies in a very long, narrow valley? I know that such a steep, narrow minimum forces the method to take many iterations to descend, but I'm unsure how to choose the descent steps more efficiently.

Any thoughts would help!


There are 4 answers below.

Answer 1

Scale the gradient by some factor at certain iterations. The gradient points you in the right direction; if it isn't getting you there fast enough, just give it a little shot of espresso. This won't always work, but there are situations in which it will.
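As a minimal sketch of this idea (the quadratic test function, the step size `gamma`, and the boost schedule below are illustrative choices, not part of the answer): every few iterations the gradient step is scaled up so the iterate covers more ground along the shallow axis of the valley.

```python
import numpy as np

def f(x):
    # Long narrow quadratic valley: shallow along x[0], steep along x[1].
    return 0.5 * (x[0]**2 + 100.0 * x[1]**2)

def grad(x):
    return np.array([x[0], 100.0 * x[1]])

def boosted_descent(x0, gamma=0.005, boost=3.0, boost_every=10, iters=2000):
    # Plain steepest descent, except that every `boost_every`-th step
    # the gradient is scaled up by `boost` (the "shot of espresso").
    # The boosted step must still keep gamma * boost below 2 / L,
    # where L is the largest curvature (here L = 100), or it diverges.
    x = np.asarray(x0, dtype=float)
    for k in range(iters):
        scale = boost if k % boost_every == 0 else 1.0
        x = x - gamma * scale * grad(x)
    return x

x_min = boosted_descent([10.0, 1.0])
```

With `boost=1.0` the same routine is ordinary steepest descent, which makes it easy to check whether the occasional boosted steps actually help on a given problem.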

Answer 2

You could try non-linear conjugate gradients (nl-CG).

The conjugate gradient method can follow narrow (ill-conditioned) valleys where the steepest descent method slows down and follows a criss-cross pattern. [wikipedia]

This is a very good introduction to CG. The nonlinear method is described in section 14.
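To make the idea concrete, here is a small sketch of nonlinear CG (the Polak-Ribiere variant with restarts, paired with a simple Armijo backtracking line search); the Rosenbrock test function and all constants are illustrative choices, not from the answer or the linked reference.

```python
import numpy as np

def rosenbrock(x):
    # Classic narrow curved valley; minimum at (1, 1).
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def rosen_grad(x):
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
        200 * (x[1] - x[0]**2),
    ])

def backtracking(f, x, p, g, t=1.0, shrink=0.5, c=1e-4):
    # Armijo backtracking line search along direction p.
    while t > 1e-12 and f(x + t * p) > f(x) + c * t * (g @ p):
        t *= shrink
    return t

def nonlinear_cg(f, gradf, x0, iters=2000, tol=1e-10):
    # Polak-Ribiere nonlinear CG with automatic restarts (PR+).
    x = np.asarray(x0, dtype=float)
    g = gradf(x)
    p = -g
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        if g @ p >= 0:          # safeguard: fall back to steepest descent
            p = -g
        t = backtracking(f, x, p, g)
        x_new = x + t * p
        g_new = gradf(x_new)
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))  # restart if negative
        p = -g_new + beta * p
        x, g = x_new, g_new
    return x

x_min = nonlinear_cg(rosenbrock, rosen_grad, [-1.2, 1.0])
```

Unlike steepest descent, each new direction mixes in the previous one via `beta`, which is what suppresses the criss-cross pattern in narrow valleys.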

Answer 3

Two common classes of improved algorithms are

  • (nonlinear) conjugate gradient, which chooses each new search direction to be conjugate to the previous ones, so as to avoid criss-crossing as much.
  • quasi-Newton methods such as BFGS, which work similarly to Newton's method in that they use (approximations to) the second derivative.
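To illustrate the second bullet, here is a minimal BFGS sketch (the Rosenbrock test function, the Armijo line-search constants, and the iteration limits are illustrative assumptions): it maintains an approximation `H` to the inverse Hessian and steps along the quasi-Newton direction instead of the raw negative gradient.

```python
import numpy as np

def rosenbrock(x):
    # Narrow curved valley; minimum at (1, 1).
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def rosen_grad(x):
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
        200 * (x[1] - x[0]**2),
    ])

def bfgs(f, gradf, x0, iters=500, tol=1e-10):
    # Maintain an approximation H to the *inverse* Hessian and step
    # along the quasi-Newton direction p = -H g.
    x = np.asarray(x0, dtype=float)
    n = x.size
    H = np.eye(n)
    g = gradf(x)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g
        t = 1.0                                   # Armijo backtracking
        while t > 1e-12 and f(x + t * p) > f(x) + 1e-4 * t * (g @ p):
            t *= 0.5
        s = t * p
        g_new = gradf(x + s)
        y = g_new - g
        sy = s @ y
        if sy > 1e-12:                            # curvature condition
            rho = 1.0 / sy
            V = np.eye(n) - rho * np.outer(s, y)
            H = V @ H @ V.T + rho * np.outer(s, s)
        x, g = x + s, g_new
    return x

x_min = bfgs(rosenbrock, rosen_grad, [-1.2, 1.0])
```

Skipping the update when the curvature condition `s @ y > 0` fails keeps `H` positive definite, so the search direction stays a descent direction.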
Answer 4

The simplest modification would be to scale the gradient by the inverse of the diagonal of the Hessian:

$x_{k+1} = x_k - \gamma\,\mathrm{diag}\!\left(\left(\frac{\partial^2 f(x_k)}{\partial x_1^2}\right)^{-1}, \left(\frac{\partial^2 f(x_k)}{\partial x_2^2}\right)^{-1}, \dots, \left(\frac{\partial^2 f(x_k)}{\partial x_n^2}\right)^{-1}\right) \nabla f(x_k)$
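A minimal sketch of this diagonal scaling on an ill-conditioned quadratic (the test function, its constant Hessian diagonal, and the value of `gamma` are illustrative choices):

```python
import numpy as np

def f(x):
    # Narrow valley: curvature 1 along x[0], curvature 100 along x[1].
    return 0.5 * (x[0]**2 + 100.0 * x[1]**2)

def grad(x):
    return np.array([x[0], 100.0 * x[1]])

def hess_diag(x):
    # Diagonal second derivatives; constant for this quadratic.
    return np.array([1.0, 100.0])

def diag_scaled_descent(x0, gamma=0.9, iters=100):
    # x_{k+1} = x_k - gamma * diag(H)^{-1} * grad f(x_k):
    # each coordinate's step is rescaled by its own curvature, so one
    # value of gamma works well in every direction at once.
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - gamma * grad(x) / hess_diag(x)
    return x

x_min = diag_scaled_descent([10.0, 1.0])
```

On this example the scaling makes both coordinates contract at the same rate, whereas plain steepest descent would be limited by the steep direction and crawl along the shallow one.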