Confusion related to augmented Lagrangian multiplier method (ADMM)

1.2k Views Asked by At

I have this confusion related to the augmented Lagrangian multiplier method from this tutorial.


enter image description here


How come the gradient with respect to $y$ is equal to $\rho(Ax^{k+1}-b)$?

1

There are 1 best solutions below

0
On

Here's one way to derive the augmented Lagrangian method. We use the proximal point method to solve the dual problem.

(In the derivation we freely use some basic facts about subgradients, conjugates, and prox operators which are discussed in convex optimization theory or in convex analysis.)

Consider the problem \begin{align*} \operatorname*{minimize}_x & \quad f(x)\\ \text{subject to} & \quad Ax = b \end{align*} where $f$ is a proper closed convex function. The Lagrangian is \begin{align*} L(x,z) &= f(x) + \langle z, Ax - b \rangle \\ &= f(x) - \langle -A^Tz, x \rangle - \langle z, b \rangle \end{align*} and the dual function is \begin{align*} G(z) &= \inf_x \quad L(x,z) \\ &= -f^*(-A^T z) - \langle z, b \rangle \end{align*} where $f^*$ is the conjugate of $f$. The dual problem, written as a minimization problem, is to minimize $g(z) = f^*(-A^T z) + \langle z, b \rangle$ with respect to $z$.

Let's solve the dual problem using the proximal point method. The proximal point iteration is \begin{align*} & z^+ = \text{prox}_{tg}(z) \\ \iff & \frac{z - z^+}{t} \in \partial g(z^+) \\ \iff & \frac{z - z^+}{t} \in -A \partial f^*(-A^T z^+) + b \\ \iff & \frac{z - z^+}{t} = -Ax + b \\ \iff & z^+ = z + t(Ax - b) \end{align*} for some $x$ satisfying \begin{align*} & x \in \partial f^*(-A^T z^+) \\ \iff & -A^T z^+ \in \partial f(x) \\ \iff & 0 \in A^T z^+ + \partial f(x) \\ \iff & 0 \in A^T(z + t(Ax - b)) + \partial f(x) \\ \iff &x \in \arg\min_u f(u) + \frac{t}{2} \| Au - b + \frac{z}{t} \|^2. \end{align*}

So, the proximal point iteration to solve the dual problem is: \begin{align*} x &= \arg\min_u f(u) + \frac{t}{2} \| Au - b + \frac{z}{t} \|^2 \\ z^+ &= z + t(Ax - b). \end{align*}

This is identical to the method of multipliers iteration.