what is the intuition behind the augmented lagrangian?

345 Views Asked by At

why do we use this ($\frac{\rho}{2}||Ax - b||^2_2$) particular function in augmented lagrangian? i.e $L_ρ(x,z,y) = f(x) + y^T (Ax − b)+\frac{\rho}{2}||Ax - b||^2_2$

any intuition for that?

1

There are 1 best solutions below

0
On

Chapter 2 of [1] details the construction of augumented lagrangian. You may want to see Section 2.2.1 in [1] for the geometric intrepreation.

The highlevel overview is that as $\rho$ tends to $\infty$ one expects the constraints to satisfied exactly and thus we will not change the solutions of the original problem. In practice, sufficiently large $\rho$ will work (depends on $f(x)$).

The term is $\frac{\rho}{2}\|Ax-b\|_2^2$ is the penalty term which essentially means that you want to achieve $Ax=b$, otherwise certain positive loss is incurred. You can in principle use any other distances, for example, Bregman distances as in [5].

For theoretical convergence gaurantees, it is good to have the quadratic term in the objective because it is strongly convex. This implies that we can work with mild assumptions on $f(x)$. For example, even if $f$ is convex, augumented lagrangian $L_{\rho}$ is actually strictly convex (in some cases strongly convex) with respect to $x$ (fix $y$ here). Regarding the discussion, you may want to check Section 2.3 of [4].

References.

[1] http://www.mit.edu/~dimitrib/Constrained-Opt.pdf

[2] https://link.springer.com/content/pdf/10.1007/BF00934777.pdf

[3] https://web.stanford.edu/~boyd/papers/pdf/admm_slides.pdf

[4] https://web.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf

[5] https://arxiv.org/abs/1306.3203