Non-linear optimization with equality constraints: shouldn't a different Hessian be defined?

Background

I am studying some theory for non-linear optimization, and I am currently learning about Lagrange multipliers.

According to this, the classical approach to solving an equality-constrained optimization problem, that is:

$$ \min \ F(\boldsymbol{x}) $$

subject to $m < n$ equality constraints:

$$ \boldsymbol{c}(\boldsymbol{x}) =\boldsymbol{0} $$

is to define a Lagrangian:

$$ L(\boldsymbol{x}, \boldsymbol{\lambda }) = F(\boldsymbol{x}) - \sum_{i=1}^{m} \lambda_i c_i(\boldsymbol{x}) $$

And then:

In a manner analogous to the unconstrained case, optimality requires that derivatives with respect to both $\boldsymbol{x}$ and $\boldsymbol{\lambda}$ be zero. More precisely, necessary conditions for the point $(\boldsymbol{x}^*, \boldsymbol{\lambda}^*)$ to be an optimum are:

$$ \nabla_x L(\boldsymbol{x}^*, \ \boldsymbol{\lambda}^*) = 0 $$ $$ \nabla_{\lambda} L(\boldsymbol{x}^*, \ \boldsymbol{\lambda}^*) = 0 $$

These become:

$$ \nabla_{x}L = \nabla F - \sum_{i=1}^{m} \lambda_i \nabla c_i = 0 $$ and $$ \nabla_{\lambda}L = - \boldsymbol{c}(\boldsymbol{x}) = 0 $$
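To make these two conditions concrete, here is a minimal sketch in Python with NumPy. The toy problem is my own assumption and does not come from the text I am reading: minimize $F(\boldsymbol{x}) = x_1^2 + x_2^2$ subject to $c(\boldsymbol{x}) = x_1 + x_2 - 1 = 0$. Because $F$ is quadratic and $c$ is linear, the conditions reduce to a linear system in $(x_1, x_2, \lambda)$.

```python
import numpy as np

# Toy problem (an assumption for illustration, not from the source):
#   minimize F(x) = x1^2 + x2^2  subject to  c(x) = x1 + x2 - 1 = 0
# Stationarity of L = F - lambda * c gives a linear system in (x1, x2, lambda):
#   2*x1 - lambda = 0
#   2*x2 - lambda = 0
#   x1 + x2       = 1
K = np.array([[2.0, 0.0, -1.0],
              [0.0, 2.0, -1.0],
              [1.0, 1.0,  0.0]])
rhs = np.array([0.0, 0.0, 1.0])

x1, x2, lam = np.linalg.solve(K, rhs)

# Evaluate both necessary conditions at the computed point.
grad_F = np.array([2.0 * x1, 2.0 * x2])
grad_c = np.array([1.0, 1.0])
grad_x_L = grad_F - lam * grad_c   # should vanish
constraint = x1 + x2 - 1.0         # should vanish

print("x* =", (x1, x2), "lambda* =", lam)
print("grad_x L =", grad_x_L, "c(x*) =", constraint)
```

In this sketch both residuals vanish at the computed point, and $\nabla F$ there is a multiple of $\nabla c$, which is what the first condition states.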

I understand the meaning of these conditions. The second one is the constraint itself, and the first says that the gradient of the objective function must be a linear combination of the constraint gradients (parallel to $\nabla c$ when there is a single constraint). So any step that changes the objective function to first order (one with a component along $\nabla F$) will also change the value of $\boldsymbol{c}(\boldsymbol{x})$, and thus the constraints will be violated.

Assumption 1: My understanding is that the decision variables are both $\boldsymbol{x}$ and $\boldsymbol{\lambda}$. While $\boldsymbol{\lambda}$ doesn't seem to have any meaning of its own, it is used to turn a constrained optimization problem into an unconstrained one. So the problem looks like this:

$$ \min_{\boldsymbol{x},\ \boldsymbol{\lambda}} \ L(\boldsymbol{x},\boldsymbol{\lambda}) $$

Question

I do not understand the following:

Just as in the unconstrained case, these conditions do not distinguish between a point that is a minimum, a maximum, or simply a stationary point. As before, we require conditions on the curvature of the objective. Let us define the Hessian of the Lagrangian to be:

$$ \boldsymbol{H}_L = \nabla^2_{xx}L = \nabla^2_{xx}F - \sum_{i=1}^{m} \lambda_i \nabla^2_{xx}c_i$$

Then, a sufficient condition is that: $$ \boldsymbol{v}^T \boldsymbol{H}_L \boldsymbol{v} > 0 $$

for any vector $\boldsymbol{v}$ in the constraint tangent space.
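As a way to see what this quoted condition asks for numerically, here is a minimal sketch in Python with NumPy. The toy problem is my own assumption, not from the quoted text: minimize $F(\boldsymbol{x}) = x_1^2 - x_2^2$ subject to $c(\boldsymbol{x}) = x_2 = 0$, whose stationary point is $\boldsymbol{x}^* = (0, 0)$ with $\lambda^* = 0$. The names `J` (Jacobian of the constraints) and `Z` (a basis of the tangent space, i.e. the null space of `J`) are also my own.

```python
import numpy as np

# Toy problem (an assumption for illustration, not from the source):
#   minimize F(x) = x1^2 - x2^2  subject to  c(x) = x2 = 0
# The stationary point is x* = (0, 0) with lambda* = 0.
H_L = np.array([[2.0,  0.0],
                [0.0, -2.0]])   # Hessian of F; c is linear, so it adds no curvature
J = np.array([[0.0, 1.0]])      # Jacobian of c, one row per constraint

# Basis Z for the tangent space {v : J v = 0}, taken from the SVD of J:
# the rows of Vt beyond the rank of J span the null space.
_, s, Vt = np.linalg.svd(J)
rank = int(np.sum(s > 1e-12))
Z = Vt[rank:].T

full_eigs = np.linalg.eigvalsh(H_L)                # curvature in all directions
reduced_eigs = np.linalg.eigvalsh(Z.T @ H_L @ Z)   # curvature along the constraint only

print("eigenvalues of H_L:", full_eigs)
print("eigenvalues of Z^T H_L Z:", reduced_eigs)
```

In this example $\boldsymbol{H}_L$ has a negative eigenvalue, yet the quadratic form is positive for every nonzero $\boldsymbol{v}$ in the tangent space, so the condition is only a statement about directions that keep the constraint satisfied to first order.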

If we augmented the problem to include both $\boldsymbol{x}$ and $\boldsymbol{\lambda}$, then why is the Hessian computed only with respect to $\boldsymbol{x}$?

Shouldn't we calculate an "augmented" Hessian $\boldsymbol{H}_{aug}$:

$$ \boldsymbol{H}_{aug} = \begin{bmatrix} \nabla^2_{xx} L & \nabla^2_{x \lambda} L \\ \nabla^2_{\lambda x} L & \nabla^2_{\lambda \lambda} L \end{bmatrix} $$

and then demand that this $\boldsymbol{H}_{aug}$ is positive definite, so that:

$$ \boldsymbol{x}_{aug}^T \boldsymbol{H}_{aug} \boldsymbol{x}_{aug} > 0 $$

where $\boldsymbol{x}_{aug} = (\boldsymbol{x}^T,\boldsymbol{\lambda}^T )^T$ (transpose, in order to be a column vector).
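For reference, the blocks of this augmented Hessian can be written out directly from the definition of $L$ above. The symbol $\boldsymbol{J}$ for the $m \times n$ Jacobian of $\boldsymbol{c}$ (with rows $\nabla c_i^T$) is my own notation, not from the text I am reading:

$$ \nabla^2_{x \lambda} L = -\boldsymbol{J}^T, \qquad \nabla^2_{\lambda x} L = -\boldsymbol{J}, \qquad \nabla^2_{\lambda \lambda} L = \boldsymbol{0} $$

$$ \boldsymbol{H}_{aug} = \begin{bmatrix} \boldsymbol{H}_L & -\boldsymbol{J}^T \\ -\boldsymbol{J} & \boldsymbol{0} \end{bmatrix} $$

The lower-right block is zero because $L$ is linear in $\boldsymbol{\lambda}$.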