Linear independence of equality constraint gradients in constraint qualifications

I'm trying to get an intuitive feel for the various constraint qualifications for KKT points. Most of them seem to rely on the linear independence of $\nabla g_i(x^*)$, where $g_i$ are the equality constraints. The book doesn't really state why.

The first KKT condition states

$\nabla f(x^*) + \sum\mu_i\nabla g_i(x^*) + \sum \lambda_j\nabla h_j(x^*) = \textbf{0}$

A hazy first guess is that if the gradients were to be linearly dependent, then any choice of $\lambda$ could potentially satisfy the condition, thus producing 'trivial' KKT points. We need to ensure that the term associated with the equality constraints only vanishes for $\lambda_j \equiv 0 $.
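To make the guess concrete, here is a minimal numerical sketch. The gradients and the name `stationarity_residual` are made up for illustration and do not come from the book. With linearly dependent constraint gradients, more than one multiplier vector satisfies the stationarity condition, although not every choice does:

```python
import numpy as np

# Hypothetical data: two equality-constraint gradients that are linearly
# dependent (the second row is twice the first).
G = np.array([[1.0, 2.0],
              [2.0, 4.0]])
grad_f = np.array([-1.0, -2.0])

def stationarity_residual(multipliers):
    """Norm of grad_f + sum_j multipliers[j] * G[j]."""
    m = np.asarray(multipliers, dtype=float)
    return float(np.linalg.norm(grad_f + G.T @ m))

# Two different multiplier vectors both satisfy stationarity ...
res_a = stationarity_residual([1.0, 0.0])
res_b = stationarity_residual([-1.0, 1.0])
# ... while an arbitrary third choice does not.
res_c = stationarity_residual([1.0, 1.0])

print(res_a, res_b, res_c)
```

In this toy setup the multipliers are not unique, but they are still restricted to a one-parameter family, so "any choice" is too strong.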

I think this is somewhat analogous to the situation in the Fritz John conditions, where the multiplier $\mu$ attached to $\nabla f(x^*)$ may be zero.

Self-studying is hard :) Am I anywhere close here?

There is 1 best solution below


With an eight-month delay:

Let me first state that I'm not a pro but a self-student myself, which might make our conversation easier.

What I noticed is that we seem to use different definitions of the linear independence constraint qualification. Nocedal/Wright's Numerical Optimization (1999, 1st ed.) states in

Definition 12.1 (LICQ). Given the point $x^*$ and the active set $\mathcal{A}(x^*)$ defined by (12.29), we say that the linear independence constraint qualification (LICQ) holds if the set of active constraint gradients $\{\nabla c_i(x^*),\ i \in \mathcal{A}(x^*)\}$ is linearly independent.

Definition (12.29), in turn, says that the active set comprises the indices of the equality constraints and of the active inequality constraints. (The $c_i$ in the above definition include equality and inequality constraints alike.)
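As a rough illustration of the definition, LICQ at a given point can be checked numerically by testing whether the matrix of active constraint gradients has full row rank. This is only a sketch: the function name `licq_holds`, the tolerance, and the sample gradients are my own assumptions, not something from the book.

```python
import numpy as np

def licq_holds(active_gradients, tol=1e-10):
    """Return True if the active constraint gradients (one per row)
    are linearly independent, i.e. the matrix has full row rank."""
    A = np.atleast_2d(np.asarray(active_gradients, dtype=float))
    if A.size == 0:
        return True  # no active constraints: nothing can be dependent
    return bool(np.linalg.matrix_rank(A, tol=tol) == A.shape[0])

# Hypothetical sample gradients at some point x*.
independent = licq_holds([[1.0, 0.0], [0.0, 1.0]])   # two independent rows
dependent = licq_holds([[1.0, 2.0], [2.0, 4.0]])     # second row = 2 * first
zero_gradient = licq_holds([[0.0, 0.0]])             # a single zero gradient
```

Note that a single active constraint whose gradient vanishes at $x^*$ already violates LICQ, since a set containing the zero vector is linearly dependent.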

Lagrangian stationarity (as you stated it above), together with the requirements that

  • only active constraints enter the sum, and
  • the multipliers of the active inequality constraints are non-negative,

is equivalent to stating that, to first order, no feasible point in a vicinity of $x^*$ gives a smaller objective value than $x^*$.

The LICQ, in turn, guarantees that any sequence of feasible points $z_k$ converging to $x^*$ satisfies $f(z_k) \ge f(x^*)$ for sufficiently large $k$, that is, in a vicinity of $x^*$.

But I must admit that it's not entirely clear to me why the LICQ is required for a first-order necessary optimality condition for constrained problems (see this post).
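A small one-dimensional example may help to see that some constraint qualification is needed. It is a sketch for illustration only and is not taken from the book or the linked post. Consider

$\min_{x \in \mathbb{R}} f(x) = x \quad \text{subject to} \quad c(x) = x^2 = 0$

The only feasible point is $x^* = 0$, so it is the minimizer. There, $\nabla f(x^*) = 1$ and $\nabla c(x^*) = 2x^* = 0$, so stationarity would require

$\nabla f(x^*) + \lambda \nabla c(x^*) = 1 + \lambda \cdot 0 = 0$

which no $\lambda$ satisfies. The single active gradient is the zero vector, so the set of active gradients is linearly dependent and LICQ fails: the minimizer exists, but it is not a KKT point.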