variant of Levenberg-Marquardt suitable for Lagrange-Newton method


If a Newton descent is applied to a scalar function of a vector, the Hessian is positive definite in the vicinity of a minimum, but it can become indefinite farther from the minimum. This can cause the Newton descent to diverge.
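A tiny numerical illustration of this failure mode (the matrix and gradient values here are made up): when the Hessian is indefinite, the Newton step need not even be a descent direction.

```python
import numpy as np

# Hypothetical point far from the minimum, where the Hessian
# has become indefinite (eigenvalues 1 and -2):
H = np.array([[1.0, 0.0],
              [0.0, -2.0]])
g = np.array([0.5, 1.0])        # gradient at that point

p = -np.linalg.solve(H, g)      # Newton step p = -H^{-1} g
print(g @ p)                    # positive: the step points uphill
```

A descent direction would satisfy `g @ p < 0`; here the negative eigenvalue flips the sign of one component, so the Newton step partly climbs the function.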

The solution found in the Levenberg-Marquardt algorithm is to add either a multiple of the identity matrix to the Hessian, or a multiple of the Hessian's main diagonal. Essentially this shifts the eigenvalues of the Hessian, and the idea is to increase the damping factor until all eigenvalues become positive (and to reduce the factor again later). This presumably rests on the assumption that the negative eigenvalues are small in absolute value and can therefore easily be pushed to positive values.
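The identity-shift variant can be sketched in a few lines (a minimal illustration with made-up numbers, not a full Levenberg-Marquardt implementation):

```python
import numpy as np

# Hypothetical Hessian far from the minimum, with one small
# negative eigenvalue (values are made up for illustration):
H = np.array([[2.0, 0.0],
              [0.0, -0.15]])

# Increase the damping factor until H + lam*I is positive definite.
lam = 1e-4
while np.min(np.linalg.eigvalsh(H + lam * np.eye(2))) <= 0.0:
    lam *= 10.0

H_damped = H + lam * np.eye(2)
print(np.linalg.eigvalsh(H_damped))  # all eigenvalues now positive
```

Adding `lam * I` shifts every eigenvalue up by `lam`, which is why a small negative eigenvalue is easily cured; a full implementation would also shrink `lam` again once steps start succeeding.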

However, if a Newton descent is applied to a Lagrangian scalar function by including the Lagrange multipliers in the state vector, the Hessian is always indefinite, since any solution is a saddle point. So some eigenvalues of the Hessian are negative even in the vicinity of the solution. (That is also why you can't iterate to a solution by a gradient descent but need a Newton descent: the inverse Hessian turns the saddle into an attractor.)
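This is easy to see on a small equality-constrained quadratic (the problem data below are hypothetical): minimize 0.5 xᵀQx subject to aᵀx = b. The Hessian of the Lagrangian with respect to (x, μ) is the KKT matrix, which is indefinite even at the exact solution, yet one Newton solve with it lands on the saddle point.

```python
import numpy as np

# Hypothetical equality-constrained quadratic:
#   minimize 0.5 x^T Q x  subject to  a^T x = b
Q = np.array([[2.0, 0.0],
              [0.0, 1.0]])
a = np.array([1.0, 1.0])
b = 1.0

# Hessian of the Lagrangian w.r.t. (x, mu): the KKT matrix.
K = np.block([[Q, a[:, None]],
              [a[None, :], np.zeros((1, 1))]])

eigs = np.linalg.eigvalsh(K)
print(eigs)  # mixed signs: the stationary point is a saddle

# A single Newton solve with K yields the saddle point exactly
# (for a quadratic, one Newton step from the origin suffices):
rhs = np.concatenate([np.zeros(2), [b]])
sol = np.linalg.solve(K, rhs)
x, mu = sol[:2], sol[2]
print(x, a @ x)  # the constraint a^T x = b is satisfied
```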

So, for Lagrangians, forcing all eigenvalues to become positive through the Levenberg-Marquardt method will probably prevent convergence to the saddle point. Still, without manipulating the Hessian, some effect must prevent convergence from points far from the solution, so there should still be a need to manipulate the Hessian in some way.
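The suspected failure can be demonstrated numerically (again with made-up problem data): damp the indefinite KKT matrix K with λI until it is positive definite, then iterate z ← z − (K + λI)⁻¹(Kz − rhs). Along the eigenvector with negative eigenvalue −ν the error is multiplied by 1 + ν/(λ − ν) > 1 per step, so the iteration moves away from the saddle-point solution even when started right next to it.

```python
import numpy as np

# Same hypothetical equality-constrained quadratic as before:
Q = np.array([[2.0, 0.0],
              [0.0, 1.0]])
a = np.array([1.0, 1.0])
b = 1.0
K = np.block([[Q, a[:, None]],
              [a[None, :], np.zeros((1, 1))]])
rhs = np.concatenate([np.zeros(2), [b]])
z_star = np.linalg.solve(K, rhs)      # the saddle point (x*, mu*)

lam = 2.0                             # large enough that K + lam*I is PD
M = K + lam * np.eye(3)
assert np.min(np.linalg.eigvalsh(M)) > 0

z = z_star + np.array([1e-3, -1e-3, 1e-3])   # start next to the solution
err0 = np.linalg.norm(z - z_star)
for _ in range(50):
    z -= np.linalg.solve(M, K @ z - rhs)     # damped "descent" step
print(np.linalg.norm(z - z_star) > err0)     # True: the error grew
```

This is only a sketch of the mechanism, not a statement about any particular Levenberg-Marquardt implementation; real codes adapt λ per step, but any fixed shift that makes the damped matrix positive definite repels the iterate from the saddle in the same way.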

My questions are (all related to the special case of Newton descent in a Lagrangian):

  1. How does the Hessian change farther from the solution?
  2. Does this affect convergence?
  3. Is there an alternative to the Levenberg-Marquardt method that is suitable for this case?