Beyond LICQ in KKT problem: can't I study irregular points as follows?

923 Views Asked by At

Given the following minimization problem $$min f(x,y)=-x-y$$ in the region described by $ x-4y+3 \geq 0$, $-x+y^2 \geq 0$, $x \geq 0$,

I notice that the origin is an irregular point. In fact, gradients of constraints, evaluated in the origin, are $$\nabla g_1(x,y) = (1,-4) $$ $$\nabla g_2(x,y) = (-1,2y) = (-1,0)$$ $$\nabla g_3(x,y) = (1,0)$$.

If I consider the Jacobian matrix given by $ ( \nabla g_2 \nabla g_3 ) $ (g2 and g3 are the active constraints in the origin), I notice that I don't have full rank, so the gradients are linearly dependent (LICQ violated! Linear independence constraint qualification: the gradients of the active inequality constraints and the gradients of the equality constraints are linearly independent). The exercise solution just states that in this case, we go on because we can't apply KKT conditions, so we aren't able to exclude that (0,0) is a point of minimum.

However, I tried to follow the proceeding that many use to explain the geometrical meaning of KKT conditions and I think I can say something more.

enter image description here

$g_1$ and $g_2$ are the constraints gradients (let's ignore their amplitude). If (0,0) is a min point, then there's no descent direction that is also feasible. In (0,0) the only feasible directions (I need to stay in the region) are $d_1$ and $d_2$. If I want to "forbid" descent feasible direction, I need to force the gradient to belong to x axis. In that way (and this is the only case!) both of feasible directions aren't descent directions. In the exercise situation, $\nabla f$ is $(-1,-1)$ so we just conclude that 0 is not a point of minimum, because one of the feasible directions defines with $\nabla f$ an obtuse angle (i.e. descent direction).

I tried to generalize the reasoning further, I came to the conclusion (probably false!) that when I deal with irregular points, I should consider the cone generated by the gradients. If the cone is not well defined (i.e. the gradients are aligned but opposite) and I have two cones/half-planes generated by them, then I should consider the intersection between half-planes that contain at least a feasible direction. You won't believe me, but I found a variety of cases in which the rule seems to work. Is it a correct idea?

Thanks a lot

1

There are 1 best solutions below

14
On BEST ANSWER

You need to be a little careful here.

With the problem $\min \{ f(x) | g_k(x) \le 0 \}$, if $x$ is a local optimum then there are non negative multipliers, not all zero, such that $\mu_0 \nabla f(x) + \sum_k \mu_k \nabla g_k(x) = 0$ (Fritz John).

The LICQ are sufficient conditions that guarantee that $\mu_0 \neq 0$, in this case they do not hold at $(0,0)$ so the conclusion doesn't apply, and, from an exercise perspective, you are done.

The optimality conditions are first order, so, loosely, they do not distinguish problems that have the same gradients at a point.

Hence the problems $P_1: \ \min \{ -x-y | -x \le 0, x - y^2 \le 0 \} $ and $P_2: \ \min \{ -x-y | -x +2y|y|\le 0, x - y^2 \le 0 \} $ 'look' the same from a first order optimality condition perspective, however, $(0,0)$ is not a local minimum for $P_1$, but is a local minimum for $P_2$.

In particular, from gradient information alone, you cannot exclude $(0,0)$ as a potential local minimum.

However, we can do a little better if we include more information. Let $C = \{ x | g_k (x) \le 0 \}$ (the feasible set) and suppose $x \in C$, then we can define the cone of feasible directions at $x$ as $D_C(x) = \{ h | \exists \delta>0, x+th \in C \ \forall t \in [0,\delta] \}$.

For $P_1$, we have $D_C((0,0)) = \{ (0,t) \}_{t \in \mathbb{R} }$ and for $P_2$, we have $D_C((0,0)) = \{ (0,t) \}_{t \le 0 }$.

Then we see if there is a direction of descent for $f$ contained in $D_C((0,0))$.

Since $\langle \nabla f((0,0)), (0,1) \rangle < 0$, we see that $(0,0)$ cannot be a local minimum for $P_1$, however since $\langle \nabla f((0,0)), h \rangle \ge 0$ for all $h \in D_C((0,0))$, we cannot exclude $(0,0)$ as a local minimum for $P_2$.

There are many generalisations and variants of the constraint tangent cone.

Also note that you need to look at the dual cone of the convex cone formed by the active gradients, not the cone itself.