How is the gradient being perpendicular to the constraints mean that the function is minimized?

1k Views Asked by At

I have a professor in optimization class who says that, in the interior point method, we of course cannot use the gradient to determine that we are at the minimum point. This is because the problem is constrained by $Ax=b$.

Instead, we use the fact that "being at the minimum implies the weaker condition that the gradient of the objective function is perpendicular to the constraints $Ax=b$".

I'm having trouble following that reasoning...as I can think of many LP problems where the gradient is not perpendicular to the constraints at the minimizer. How is this true?

2

There are 2 best solutions below

5
On BEST ANSWER

I'm not sure you're claim is accurate. In any case, what we know in general is this: Given a function $f: \mathcal H \rightarrow (-\infty,+\infty]$, recall the definition of its sub-differential at a point $x \in \mathcal H$, namely

$$ \partial f(x) := \{v \in \mathcal H | f(z) \ge f(x) + v^T(z-x)\}. $$ An element of $\partial f(x)$ is called a sub-gradient of $f$ at $x$. The following classical result is immediate. Viz,

First-order optimality condition: $x^* \in \text{argmin }f \iff 0 \in \partial f(x^*).$

Now, minimizing $h(x)$ subject to $x \in C$ is equivalent to minimizing $\;f(x):= h(x) + i_C(x)$ without any constraints. Thus,

$$ \begin{split} x^* \in \text{argmin }h|_{\mathcal C} &\iff x^* \in \text{argmin }h + i_{\mathcal C} \iff 0 \in \partial (h + i_{\mathcal C}) \iff -\nabla h(x^*) \in \partial i_{\mathcal C} \\ &\iff \langle \nabla h(x^*), z - x^*\rangle \ge 0\; \forall z \in \mathcal C, \end{split} $$ which gives something which roughly amounts to your claim, when the constraint set $\mathcal C$ is affine.

N.B.: $i_C$ denotes the indicator function of $C$ defined by $i_C(x)= 0$ if $x \in C$ and $i_C(x) = \infty$ otherwise.

0
On

I believe a more precise term would be the gradient is perpendicular to a supporting hyperplane of the feasible set.

When the feasible set is a smooth manifold (without "corners"), then the supporting hyperplane is a tangent and in this case the gradient is indeed perpendicular.

When the feasible set has a corner, there are many supporting hyperplanes. For example, basic feasible solutions in LP problems are "corners", and thus at the optimum the objective gradient is perpendicular to one of the infinite set of supporting hyperplanes.