Local optima of non-negative-least-squares?

43 Views Asked by At

For a fixed matrix $A \in \mathbb{R}^{n \times m}$ and a vector $b \in \mathbb{R}^{n}$ the non-negative-least-square problem is given by

$$\underset{x \in \mathbb{R}^m_{\geq 0}}{\arg \min} \ g(x), \ \ \ g(x) = \vert \vert Ax-b\vert\vert^2.$$

Now, consider the Lagrangian $$L(x,\mu) = g(x) + \mu^T x.$$

By the Karush-Kuhn-Tucker conditions, we have the following necessary conditions for $x^*$ to be a local optimum: There should exist $\mu \leq 0$ such that

  • $ \nabla_x L(x^*,\mu) = 0$

  • $ x^* \geq 0$

  • $ {x^*}^T \mu = 0.$

The first condition reads $$ A^T (Ax - b) +\mu = 0.$$ Together with the other conditions, this should yield the following property of entries $i$

$$(A^TAx^* - A^T b)_i= 0, \text{ if } {x^*}_i >0 $$

$$ (A^TAx^* - A^T b)_i= \mu_i \text{ else }.$$

That is, in some sense the possible optimum $x^*$ solves the equation for some entries, and fails to do so for other entries.

Now, according to wikipedia, this is a convex problem with a convex feasible set, so if I understand this correctly (I never really learned about optimization with inequality constraints) then there should be only one one global optimum of the solution.

There still could however exist other ${x^*}$

that fulfill the above conditions. I am wondering, since it might be a problem that can be reasonably well understood, whether one can understand these points ${x^*}$.

Why do they fail to be local-minima? Can we still interpret them in some meaningful way related to our objective? Or is my reasoning wrong all along?

1

There are 1 best solutions below

0
On BEST ANSWER

Broadly speaking, with convex problems, necessary conditions are sufficient.

A nice characterisation of an optimal solution is: If the cost $f$ is convex & differentiable and the constraint set $C$ is convex, then $x^* \in C$ solves $\min_{x \in C} f(x)$ iff $df(x^*, x-x^*) \ge 0$ for all $x \in C$.

In terms of the above problem, if $x^*$ satisfies the above conditions, we have, for any $x \ge 0$, $\langle \nabla g(x^*), x-x^* \rangle = - \langle \mu^*, x-x^* \rangle = - \langle \mu^*, x \rangle \ge 0$, and so $x^*$ is a solution.