TLDR: I can't figure out whether the standard multiplier update is valid in the partially augmented Lagrangian case, and if not, what update would be more appropriate. Many thanks in advance to anyone who can help.
I have a problem of the form
$$\min f(x) \\ \mathrm{s.t.}\ c(x) = 0\\ l\leq x\leq u$$
That is, a nonlinear cost and nonlinear equality constraints, with simple bounds on $x$. I want to solve this with the augmented Lagrangian method, but with a *partially* augmented Lagrangian: the equality constraints $c(x)$ move into the augmented Lagrangian, while the bounds $l\leq x\leq u$ remain hard constraints. My proposed formulation is:
$$L(x,\lambda) := f(x) + \lambda^T c(x) + \frac{\rho}{2}\|c(x)\|_2^2.$$
To solve it: initialize a guess $x_0$, set $\rho_0 = 1$ and $\lambda_0=0$, and iterate:
$x_{k+1} = \mathrm{argmin}_{x\in [l,u]}L(x,\lambda_k)$.
$\lambda_{k+1} = $ some update.
$\rho_{k+1} = $ some (monotone increasing) update.
The standard augmented Lagrangian method uses the multiplier update $\lambda_{k+1} = \lambda_k + \rho_k c(x_{k+1})$. My concern is that the usual derivation of this update relies on $\nabla_x L(x_{k+1},\lambda_k)=0$, which need not hold here: when a bound is active at $x_{k+1}$, only projected stationarity holds, and the corresponding components of the gradient can be nonzero.
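For concreteness, here is a minimal numerical sketch of the scheme above with the standard update, on a toy problem of my own making (the test problem, the projected-gradient inner solver, and all parameter choices are illustrative assumptions, not taken from any of the references):

```python
# Toy instance of the partially augmented Lagrangian loop:
#   min (x0-1)^2 + (x1-1)^2  s.t.  x0 + x1 = 1,  0 <= x0 <= 0.3,  0 <= x1 <= 1.
# The bound x0 <= 0.3 is active at the solution x ~ (0.3, 0.7), so the
# gradient of L does NOT vanish at the inner minimizer.
# Inner solve: projected gradient descent, clipping onto the box [l, u].

def f_grad(x):
    """Gradient of f(x) = (x0-1)^2 + (x1-1)^2."""
    return [2.0 * (x[0] - 1.0), 2.0 * (x[1] - 1.0)]

def c(x):
    """Scalar equality constraint c(x) = x0 + x1 - 1."""
    return x[0] + x[1] - 1.0

def al_grad(x, lam, rho):
    """Gradient of L(x, lam) = f + lam*c + (rho/2)*c^2 (here grad c = [1, 1])."""
    s = lam + rho * c(x)
    g = f_grad(x)
    return [g[0] + s, g[1] + s]

def inner_solve(x, lam, rho, l, u, iters=500):
    """Approximate argmin of L over the box [l, u] via projected gradient."""
    step = 1.0 / (2.0 + 2.0 * rho)  # ~1/L for this quadratic's Hessian
    for _ in range(iters):
        g = al_grad(x, lam, rho)
        x = [min(max(x[i] - step * g[i], l[i]), u[i]) for i in range(2)]
    return x

l, u = [0.0, 0.0], [0.3, 1.0]
x, lam, rho = [0.0, 0.0], 0.0, 1.0
for _ in range(30):
    x = inner_solve(x, lam, rho, l, u)  # x_{k+1}, warm-started from x_k
    lam = lam + rho * c(x)              # standard first-order update
    rho = min(10.0, 2.0 * rho)          # monotone increase, capped

# Expected: x ~ (0.3, 0.7) with the bound x0 <= 0.3 active,
# lam ~ 0.6, and c(x) -> 0.
```

On this example the standard update does drive $c(x)\to 0$ even though $\nabla_x L$ is nonzero at each inner minimizer; my question is whether this is guaranteed in general or an artifact of the example.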
Here are some things I have looked into as an attempt to resolve my confusion:
- "Augmented Lagrangian and Differentiable Exact Penalty Methods" by Bertsekas (1981) (https://dspace.mit.edu/bitstream/handle/1721.1/947/p-1113-15613904.pdf?sequence=1) states on page 7 that problems of a very similar form can be solved as I suggested, and does not indicate that the multiplier update should differ from the fully augmented Lagrangian case.
- "A Globally Convergent Augmented Lagrangian Algorithm for Optimization with General Constraints and Simple Bounds" by Conn et al. (1991) addresses problems of exactly this structure. They use the standard multiplier update written above (their eq. 2.5), but their algorithm has many other moving parts (e.g., tolerances on constraint violation and optimality that vary between the primal, multiplier, and penalty-parameter updates), and I cannot tell whether those extra pieces are necessary to make the multiplier update work in the partially augmented Lagrangian setting.
- These slides: https://www.him.uni-bonn.de/fileadmin/him/Section6_HIM_v1.pdf give, on slide 5, a derivation of this same multiplier update that does not rely on the gradient of the augmented Lagrangian vanishing; however, the derivation feels less than rigorous.
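For what it's worth, here is my own (unverified) sketch of how the derivation might adapt. At $x_{k+1} = \mathrm{argmin}_{x\in[l,u]} L(x,\lambda_k)$, the KKT conditions of the inner bound-constrained problem give

$$\nabla f(x_{k+1}) + \nabla c(x_{k+1})^T\bigl(\lambda_k + \rho_k\, c(x_{k+1})\bigr) = z_l - z_u,\qquad z_l,\, z_u \geq 0,$$

where $z_l, z_u$ are multipliers for the active lower and upper bounds. Substituting $\lambda_{k+1} = \lambda_k + \rho_k c(x_{k+1})$, this is exactly the stationarity condition of the *original* problem's Lagrangian, with the bound multipliers absorbing the nonvanishing part of the gradient. Whether this identification actually justifies convergence of the overall scheme is what I cannot determine.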