Connection among big-M, Lagrangian, penalty method, and augmented Lagrangian


In the context of solving linear programs, the big-M method refers to adding artificial variables to the problem so that, as far as I understand it, there is a trivial basic feasible solution. To ensure that a feasible solution of this augmented problem corresponds to a feasible solution of the original problem, the terms $M\cdot a_i$ (one per artificial variable $a_i$) are added to the objective, with a very large $M$. This can be implemented symbolically, treating $M$ as lexicographically larger than every other quantity ($Mx > y$ for all $x, y > 0$), or $M$ can actually just be chosen, barring numerical issues, large enough.
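As a concrete sketch of that construction (the problem data and the choice $M = 10^3$ are my own illustrative assumptions, not from any particular text), here is the big-M trick on a tiny equality-constrained LP, solved with `scipy.optimize.linprog`:

```python
from scipy.optimize import linprog

# Original LP: min x1 + 2*x2  s.t.  x1 + x2 = 1,  x >= 0.
# Big-M version: append one artificial variable a >= 0 per equality
# constraint, giving the trivially feasible point (x, a) = (0, b),
# and charge M*a in the objective to drive a back to zero.
M = 1e3                      # a "large enough" but finite penalty
c_bigM = [1.0, 2.0, M]       # [c; M] -- the last entry prices the artificial
A_eq = [[1.0, 1.0, 1.0]]     # [A, I] -- the artificial enters the constraint
b_eq = [1.0]

res = linprog(c_bigM, A_eq=A_eq, b_eq=b_eq)  # bounds default to x >= 0
x_opt, artificial = res.x[:2], res.x[2]
```

For this data the solver returns $x = (1, 0)$ with the artificial variable exactly zero: the finite $M$ already suffices, which is the LP behavior described next.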

It seems, then, that with LPs, you can drive variables to zero by choosing some large but finite multiplier.

This reminds me of penalty methods for solving constrained optimization problems with unconstrained solvers, except that penalty methods typically only recover the solution in a limit: effectively, you need $M\rightarrow \infty$.

As an example, consider the following constrained minimization:

constrained QP: $\min_x \frac{1}{2}x^2$ subject to $x=1$

One penalty formulation is:

penalty QP: $\min_x \frac{1}{2}x^2 + \frac{1}{2}k(x-1)^2$

which has as a solution $x=\frac{k}{1+k}$.

As $k\rightarrow \infty$, we arrive at the correct answer, but for any finite $k$ there is always some fighting between the objective and the constraint.
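A quick numerical check of that closed form (the particular values of $k$ are arbitrary illustrative choices):

```python
from scipy.optimize import minimize_scalar

def penalty_obj(x, k):
    # penalty QP: (1/2) x^2 + (1/2) k (x - 1)^2
    return 0.5 * x**2 + 0.5 * k * (x - 1.0)**2

# The unconstrained minimizer is x = k / (1 + k): it approaches the
# constrained answer x = 1 as k grows, but never reaches it.
sols = {}
for k in [1.0, 10.0, 1000.0]:
    res = minimize_scalar(lambda x: penalty_obj(x, k))
    sols[k] = res.x
```

At $k=1$ the minimizer is $1/2$, at $k=1000$ it is $1000/1001$: always short of the constraint.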

It seems like this fighting doesn't happen with linear objectives. Part of my question is understanding why.

The other part of my question is understanding Lagrange multipliers as just the right coefficients to put on soft constraints. For the constrained quadratic program above, there is a $\lambda$ ($\lambda=-1$, from the stationarity condition $x + \lambda = 0$ at $x = 1$) such that the following unconstrained problem returns the same answer as the constrained problem: $\min_x \frac{1}{2}x^2 + \lambda(x-1)$.
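Checking that numerically (a sketch; the solver choice is arbitrary):

```python
from scipy.optimize import minimize_scalar

lam = -1.0  # stationarity of (1/2) x^2 + lam * (x - 1) gives x = -lam

# Unconstrained Lagrangian problem: min_x (1/2) x^2 + lam * (x - 1)
res = minimize_scalar(lambda x: 0.5 * x**2 + lam * (x - 1.0))
```

Unlike the quadratic penalty, this lands on the constrained solution $x = 1$ at a finite coefficient; the catch is that the right $\lambda$ depends on the problem.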

The immediate difference between this and the penalty version is that the "soft constraint" term is linear rather than quadratic. This I can see, but I'm searching for some deeper understanding. For example, it seems like I should be able to get rid of the squaring in the penalty version: if I know which way the objective is pulling away from the constraint, then I know which sign to give $\lambda$. But this isn't right; taking $\lambda \rightarrow \infty$ isn't the right thing to do.

Where is the parallel?