My textbook, Deep Learning by Goodfellow, Bengio, and Courville, says the following in a section on constrained optimization:
The Karush-Kuhn-Tucker (KKT) approach provides a very general solution to constrained optimization. With the KKT approach, we introduce a new function called the generalized Lagrangian or generalized Lagrange function.
To define the Lagrangian, we first need to describe $\mathbb{S}$ in terms of equations and inequalities. We want a description of $\mathbb{S}$ in terms of $m$ functions $g^{(i)}$ and $n$ functions $h^{(j)}$ so that $\mathbb{S} = \{ \boldsymbol{x} \mid \forall i, g^{(i)}(\boldsymbol{x}) = 0 \ \text{and} \ \forall j, h^{(j)} (\boldsymbol{x}) \le 0 \}$. The equations involving $g^{(i)}$ are called the equality constraints, and the inequalities involving $h^{(j)}$ are called the inequality constraints.
We introduce new variables $\lambda_i$ and $\alpha_j$ for each constraint; these are called the KKT multipliers. The generalized Lagrangian is then defined as
$$L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\alpha}) = f(\boldsymbol{x}) + \sum_i \lambda_i g^{(i)} (\boldsymbol{x}) + \sum_j \alpha_j h^{(j)}(\boldsymbol{x}) \tag{4.14}$$
We can now solve a constrained minimization problem using unconstrained optimization of the generalized Lagrangian. As long as at least one feasible point exists and $f(\boldsymbol{x})$ is not permitted to have value $\infty$, then
$$\min_{\boldsymbol{x}} \max_{\boldsymbol{\lambda}} \max_{\boldsymbol{\alpha}, \boldsymbol{\alpha}\ge 0} L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\alpha}) \tag{4.15}$$
has the same optimal objective function value and set of optimal points $\boldsymbol{x}$ as
$$\min_{\boldsymbol{x} \in \mathbb{S}} f(\boldsymbol{x}). \tag{4.16}$$
This follows because any time the constraints are satisfied,
$$\max_{\boldsymbol{\lambda}} \max_{\boldsymbol{\alpha}, \boldsymbol{\alpha}\ge 0} L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\alpha}) = f(\boldsymbol{x}),$$
while any time a constraint is violated,
$$\max_{\boldsymbol{\lambda}} \max_{\boldsymbol{\alpha}, \boldsymbol{\alpha}\ge 0} L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\alpha}) = \infty.$$
These properties guarantee that no infeasible point can be optimal, and that the optimum within the feasible points is unchanged.
I'm having difficulty understanding how $$\min_{\boldsymbol{x}} \max_{\boldsymbol{\lambda}} \max_{\boldsymbol{\alpha}, \boldsymbol{\alpha}\ge 0} L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\alpha})$$
has the same optimal objective function value and set of optimal points $\boldsymbol{x}$ as
$$\min_{\boldsymbol{x} \in \mathbb{S}} f(\boldsymbol{x}).$$
Specifically, I am not seeing why it holds that any time the constraints are satisfied,
$$\max_{\boldsymbol{\lambda}} \max_{\boldsymbol{\alpha}, \boldsymbol{\alpha}\ge 0} L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\alpha}) = f(\boldsymbol{x}),$$
while any time a constraint is violated,
$$\max_{\boldsymbol{\lambda}} \max_{\boldsymbol{\alpha}, \boldsymbol{\alpha}\ge 0} L(\boldsymbol{x}, \boldsymbol{\lambda}, \boldsymbol{\alpha}) = \infty.$$
I would greatly appreciate it if people could please take the time to clarify this.
Starting with the case where the constraints are satisfied:
If the constraints are satisfied, then $g^{(i)}(x)=0$ and $h^{(j)}(x)\leq0$ for all $i$ and $j$. The terms with $\lambda_i$ therefore all vanish, and each term $\alpha_j h^{(j)}(x)$ attains its maximum over $\alpha_j \ge 0$ at $\alpha_j=0$ (since $\alpha_j \ge 0$ and $h^{(j)}(x)\le 0$ make the product nonpositive), so those terms vanish as well, leaving you with $f(x)$.
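A tiny worked instance (my own example, not from the book) makes this concrete: take a single inequality constraint $h(x) = x - 1 \le 0$ and the feasible point $x = 0$, where $h(0) = -1 < 0$. Then $$\max_{\alpha \ge 0}\left[f(0) + \alpha h(0)\right] = \max_{\alpha \ge 0}\left[f(0) - \alpha\right] = f(0),$$ with the maximum attained at $\alpha = 0$: any positive $\alpha$ can only decrease the value.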
On the other hand, suppose a constraint is not satisfied. If $g^{(i)}(x)\neq 0$ for some $i$, you can let $\lambda_i g^{(i)}(x)$ go to infinity by letting $\lambda_i \to \infty$ if $g^{(i)}(x)>0$, and $\lambda_i \to -\infty$ if $g^{(i)}(x)<0$. Similarly, if $h^{(j)}(x)>0$ for some $j$, you can let $\alpha_j h^{(j)}(x)$ go to $\infty$ by letting $\alpha_j \to \infty$.
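Both cases can also be checked numerically. Here is a minimal sketch (my own illustration, not from the book) for the toy problem of minimizing $f(x) = x^2$ subject to $g(x) = x - 1 = 0$ and $h(x) = -x \le 0$, so that $\mathbb{S} = \{1\}$:

```python
# Toy problem: minimize f(x) = x^2
# subject to g(x) = x - 1 = 0  and  h(x) = -x <= 0, so S = {1}.

def f(x):
    return x ** 2

def g(x):
    return x - 1.0   # equality constraint: g(x) = 0

def h(x):
    return -x        # inequality constraint: h(x) <= 0

def L(x, lam, alpha):
    """Generalized Lagrangian, Eq. (4.14), with one g and one h."""
    return f(x) + lam * g(x) + alpha * h(x)

# Feasible point x = 1: g(1) = 0 kills the lambda term, and since
# h(1) = -1 < 0, the alpha term is maximized over alpha >= 0 at
# alpha = 0. So the sup of L over the multipliers is f(1), whatever
# value lambda takes.
for lam in (-50.0, 0.0, 50.0):
    assert L(1.0, lam, 0.0) == f(1.0)
    assert L(1.0, lam, 10.0) < f(1.0)   # positive alpha only hurts

# Infeasible point x = 2: g(2) = 1 != 0, so pushing lambda toward
# infinity drives L toward infinity.
print([L(2.0, lam, 0.0) for lam in (1.0, 1e3, 1e6)])
# → [5.0, 1004.0, 1000004.0]
```

The printed values grow without bound as $\lambda$ grows, which is exactly why the $\max$ over the multipliers equals $\infty$ at any infeasible point.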