Prove that $ x^* $ is a local minimum if and only if it is a global minimum.


Consider the quadratic program

$$\min \; f(x) \qquad \text{s.t.} \quad Ax = c$$

Prove that $x^*$ is a local minimum if and only if it is a global minimum. No convexity is assumed. The backward direction is trivial.

Could anyone help with the proof? Thanks!

If there is a feasible point $x_0$, then the feasible set is $\{x_0\} + \ker A$, which can be written as $\{x_0 + Ly\}_y$ for some matrix $L$ whose range is $\ker A$.

Hence we can define $\phi(y) = f(x_0+Ly)$; note that $\phi$ is also quadratic, and the problem reduces to showing that $y_0$ is a local $\min$ iff $y_0$ is a global $\min$.
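As a quick sanity check of this parameterization, here is a minimal numpy sketch (the particular $A$, $c$, and rank tolerance below are made-up for illustration): it builds a basis $L$ of $\ker A$ from the SVD and verifies that every point $x_0 + Ly$ is feasible.

```python
import numpy as np

# Made-up example: A is 1x2, so ker A is one-dimensional.
A = np.array([[1.0, 2.0]])
c = np.array([3.0])

# A particular feasible point x0 (least squares gives one solution of A x = c).
x0, *_ = np.linalg.lstsq(A, c, rcond=None)

# Columns of L span ker A: right-singular vectors beyond the rank of A.
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
L = Vt[rank:].T  # shape (n, dim ker A)

# Every point of the form x0 + L y satisfies A x = c.
for _ in range(5):
    y = np.random.randn(L.shape[1])
    assert np.allclose(A @ (x0 + L @ y), c)
```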

Suppose $y_0$ is a local $\min$. Since $y_0$ is an unconstrained local minimum of the quadratic $\phi$, we have $\nabla \phi(y_0) = 0$, so the linear term of the expansion vanishes and we can write $\phi(y_0+h) = \phi(y_0) + {1\over 2}\langle h, Q h \rangle$. It follows that $Q \ge 0$: otherwise there would be a direction $h$ with $\langle h, Qh \rangle < 0$, and $y_0$ would not be an unconstrained local $\min$. (That is, the 'restricted' Hessian of $f$ is positive semidefinite.) Hence $\phi$ is convex and $y_0$ is a global solution.

Note that the above does not imply that $f$ itself is convex: for example, take $f(x) = x_1^2 - x_2^2$ subject to $x_2 = 0$.
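This last example is easy to check numerically (a small sketch; the grid and step size are arbitrary choices): $f(x) = x_1^2 - x_2^2$ is indefinite, yet the restricted function $\phi(y) = f(x_0 + Ly) = y^2$ has nonnegative second differences everywhere, i.e. it is convex.

```python
import numpy as np

def f(x):
    # Indefinite quadratic: Hessian eigenvalues are 2 and -2.
    return x[0] ** 2 - x[1] ** 2

# Feasible set {x : x_2 = 0} = x0 + ker A with A = [0, 1], x0 = 0, L = e1.
x0 = np.zeros(2)
L = np.array([1.0, 0.0])

def phi(y):
    return f(x0 + L * y)

# f is not convex: it decreases along e2 away from the origin ...
assert f(np.array([0.0, 1.0])) < f(np.zeros(2))

# ... but phi(y) = y^2 is convex: second differences are nonnegative.
h = 0.5
for y in np.linspace(-3.0, 3.0, 13):
    assert phi(y + h) - 2.0 * phi(y) + phi(y - h) >= 0.0
```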


copper.hat's answer above is already very good. However, I aim to answer this question with different notation, hoping to explain it in the plainest possible language.

First, the set of all points satisfying $Ax=c$ forms the feasible region of this optimization problem, and every point in this set is called a feasible point. We want to show that a feasible point is a local minimum iff it is a global minimum.

Assume $x^*$ is a local minimum, and let $y$ be any other feasible point. Set $d = y - x^*$; then $d$ is a feasible direction.

Now consider $Ad$: since $Ay = c$ and $Ax^* = c$, we have $Ad = A(y - x^*) = 0$, so $x^* + ad$ is feasible for every scalar $a$. Since $x^*$ is a local minimum, $f(x^* + ad) \ge f(x^*)$ for all sufficiently small $a$.

Now observe that $\frac{f(x^*+ad)-f(x^*)}{a} \ge 0$ for small $a > 0$, and letting $a \to 0^+$ gives $\nabla f(x^*)^T d \ge 0$. Applying the same argument to $-d$, which is also a feasible direction, gives $\nabla f(x^*)^T d \le 0$, hence $\nabla f(x^*)^T d = 0$.

Another key observation is that $\frac{f(x^*+ad) - 2f(x^*) + f(x^*-ad)}{a^2} \ge 0$ for small $a$, and letting $a \to 0$ gives $d^T \nabla^2 f(x^*)\, d \ge 0$. Since $f$ is quadratic, its Hessian is constant, so $\nabla^2 f(x^* + \theta d) = \nabla^2 f(x^*)$ for every $\theta \in [0,1]$.

By Taylor's theorem (which is exact for quadratics), $f(y) = f(x^* + d) = f(x^*) + \nabla f(x^*)^T d + \frac{1}{2}\, d^T \nabla^2 f(x^*)\, d$. By the above, the linear term is zero and the quadratic term is nonnegative, so $f(y) \ge f(x^*)$. Since $y$ was an arbitrary feasible point, $x^*$ is a global minimum.
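The whole argument can be checked numerically on a concrete instance (a sketch under made-up data: the particular $Q$, $b$, $A$, $c$ below are illustrative, and the constrained minimizer is obtained from the standard KKT system): the gradient at $x^*$ is orthogonal to $\ker A$, the curvature along feasible directions is nonnegative, and $f(y) \ge f(x^*)$ for every feasible $y$, even though $Q$ is indefinite.

```python
import numpy as np

# Made-up indefinite quadratic: f(x) = 1/2 x^T Q x + b^T x, constraint x_2 = 0.
Q = np.diag([2.0, -2.0])
b = np.array([2.0, 0.0])
A = np.array([[0.0, 1.0]])
c = np.array([0.0])

def f(x):
    return 0.5 * x @ Q @ x + b @ x

def grad(x):
    return Q @ x + b

# Solve the KKT system [Q A^T; A 0][x; lam] = [-b; c] for the minimizer x*.
K = np.block([[Q, A.T], [A, np.zeros((1, 1))]])
x_star = np.linalg.solve(K, np.concatenate([-b, c]))[:2]

d = np.array([1.0, 0.0])  # a feasible direction: A d = 0

assert np.isclose(grad(x_star) @ d, 0.0)  # gradient vanishes along ker A
assert d @ Q @ d >= 0.0                   # curvature nonnegative along ker A

# Taylor is exact for quadratics, so f(y) >= f(x*) for every feasible y.
for t in np.linspace(-5.0, 5.0, 11):
    assert f(x_star + t * d) >= f(x_star) - 1e-12
```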