Augmented Lagrangian methods for nonlinear optimization


Let $f(\cdot):\mathbb{R}^n \rightarrow \mathbb{R}$ and $h(\cdot):\mathbb{R}^n \rightarrow \mathbb{R}^m$ be nonlinear functions. Consider the nonlinear optimization problem $$\begin{matrix}\min_x f(x)\\ \text{s.t. }h(x)=0\end{matrix}$$ The method of multipliers is based on the following algorithm:

  1. Select an initial $\lambda_0\in \mathbb{R}^m$ and a penalty parameter $c>0$
  2. At step $i$: Compute $x_i=\arg\min_x f(x)+\lambda_i^T h(x)+\frac{c}{2}\|h(x)\|^2$
  3. At step $i$: Compute $\lambda_{i+1}=\lambda_i+ch(x_i)$
  4. Go back to 2.
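The iteration above can be sketched in a few lines of NumPy. The toy problem below ($f(x)=x_1^2+x_2^2$, $h(x)=x_1+x_2-1$, with known solution $x^\star=(0.5,0.5)$, $\lambda^\star=-1$) is my own illustrative choice, and the inner minimization in step 2 is done by plain gradient descent, i.e. only a local method, which is exactly the practical situation the question describes:

```python
import numpy as np

# Toy problem (chosen for illustration, not from the question):
# minimize f(x) = x1^2 + x2^2  subject to  h(x) = x1 + x2 - 1 = 0.
# Known solution: x* = (0.5, 0.5) with multiplier lambda* = -1.
f = lambda x: x @ x
grad_f = lambda x: 2.0 * x
h = lambda x: np.array([x[0] + x[1] - 1.0])
jac_h = lambda x: np.array([[1.0, 1.0]])       # Jacobian of h (m x n)

def method_of_multipliers(x, lam, c=10.0, outer_iters=50,
                          inner_iters=500, lr=1e-2):
    for _ in range(outer_iters):
        # Step 2: minimize the augmented Lagrangian in x -- here only
        # locally, by gradient descent, warm-started at the previous x.
        for _ in range(inner_iters):
            grad_L = grad_f(x) + jac_h(x).T @ lam + c * jac_h(x).T @ h(x)
            x = x - lr * grad_L
        # Step 3: multiplier update.
        lam = lam + c * h(x)
    return x, lam

x, lam = method_of_multipliers(np.zeros(2), np.zeros(1))
print(x, lam)   # x ≈ [0.5, 0.5], lam ≈ [-1.0]
```

On this (convex) toy problem the local inner solver is enough and the iterates converge to the constrained minimizer; the question of what happens in the genuinely nonconvex case is precisely what is asked below.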

For a large enough $c \in \mathbb{R}$, the method converges to a local optimum $x^\star$ of $f(x)$ subject to $h(x)=0$ with associated Lagrange multiplier $\lambda^\star$.

My question is about step 2, which is itself another nonlinear optimization problem. From a practical point of view, it seems to me that we can only obtain local solutions to step 2, e.g. with gradient descent. However, the theory is based on the fact that $x_i$ is the global optimum of the augmented Lagrangian function. Can we still ensure convergence of the algorithm to a local minimum of $f(x)$ subject to $h(x)=0$, if at every step 2 we compute only a local minimum of the augmented Lagrangian? Am I misinterpreting the theory? Thanks!

1 Answer

Using the Lagrangian or augmented Lagrangian, recall that the stationary points satisfying

$$ \nabla L(x,\lambda) = 0 $$

are located at the saddle points of $L$, i.e.

$$ L(x^*,\lambda)\le L(x^*,\lambda^*)\le L(x,\lambda^*) $$

where $(x^*, \lambda^*)$ is a solution of $\nabla L = 0$. Once you have $(x^*, \lambda^*)$, you should check what kind of stationary point $x^*$ is, using the bordered Hessian of $f(x)$ and $h(x)$. This method appears to be more direct, but it can converge to unexpected points.
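As a small sketch of the bordered-Hessian test, on the same kind of toy problem as above (my own example, not from the answer): for $f(x)=x_1^2+x_2^2$, $h(x)=x_1+x_2-1$ and candidate $(x^*,\lambda^*)=((0.5,0.5),-1)$, one checks the signs of the last $n-m$ leading principal minors of the bordered Hessian, which for a local minimum must all have the sign of $(-1)^m$:

```python
import numpy as np

# Toy problem (an illustrative assumption): f = x1^2 + x2^2, h = x1 + x2 - 1,
# candidate x* = (0.5, 0.5), lambda* = -1, with n = 2 variables, m = 1 constraint.
n, m = 2, 1
grad_h = np.array([[1.0, 1.0]])      # Jacobian of h at x* (m x n)
hess_L = 2.0 * np.eye(n)             # Hessian in x of L = f + lambda^T h at (x*, lambda*)

# Bordered Hessian: [[0, grad_h], [grad_h^T, hess_L]].
B = np.block([[np.zeros((m, m)), grad_h],
              [grad_h.T,         hess_L]])

# Second-order test: the last n - m leading principal minors of B
# must all have the sign of (-1)^m for x* to be a local minimum.
minors = [np.linalg.det(B[:k, :k]) for k in range(2 * m + 1, n + m + 1)]
is_min = all(np.sign(d) == (-1) ** m for d in minors)
print(minors, is_min)   # single minor det(B) = -4 < 0, so x* is a local minimum
```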

The proposed sequential method is more reliable for obtaining the solution of the minimization problem associated with $f(x)$ and $h(x)$.