For a fixed matrix $A \in \mathbb{R}^{n \times m}$ and a vector $b \in \mathbb{R}^{n}$, the non-negative least-squares problem is given by
$$\underset{x \in \mathbb{R}^m_{\geq 0}}{\arg \min} \ g(x), \qquad g(x) = \|Ax-b\|^2.$$
Now, consider the Lagrangian $$L(x,\mu) = g(x) + \mu^T x.$$
By the Karush-Kuhn-Tucker conditions, we have the following necessary conditions for $x^*$ to be a local optimum: there should exist $\mu \leq 0$ such that
$$\nabla_x L(x^*,\mu) = 0,$$
$$x^* \geq 0,$$
$${x^*}^T \mu = 0.$$
The first condition reads (absorbing the constant factor $2$ from the gradient of $g$ into $\mu$) $$ A^T (Ax^* - b) +\mu = 0.$$ Together with the other conditions, this yields, for each entry $i$,
$$(A^TAx^* - A^T b)_i= 0, \text{ if } {x^*}_i >0, $$
$$ (A^TAx^* - A^T b)_i= -\mu_i \geq 0 \text{ otherwise}.$$
That is, the candidate optimum $x^*$ solves the normal equations $A^T A x = A^T b$ in the entries where it is strictly positive, and may fail to do so in the remaining entries.
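These entrywise conditions can be checked numerically on a solver's output. A small sketch (the choice of `scipy.optimize.nnls` and the random test data are my own, not part of the question):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

# Solve min_{x >= 0} ||Ax - b||_2
x_star, _ = nnls(A, b)

# Gradient A^T (A x* - b), up to the constant factor 2; equals -mu at x*
grad = A.T @ (A @ x_star - b)

active = x_star > 1e-10
print(np.allclose(grad[active], 0, atol=1e-8))  # zero on the support of x*
print(np.all(grad[~active] >= -1e-8))           # elsewhere -mu_i >= 0
```

Both checks should pass: the gradient vanishes on the strictly positive entries and is non-negative on the rest.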
Now, according to Wikipedia, this is a convex problem with a convex feasible set, so if I understand this correctly (I never really learned about optimization with inequality constraints), there should be only one global optimum.
There could, however, still exist other points ${x^*}$ that fulfill the above conditions. Since this seems like a problem that can be reasonably well understood, I am wondering whether one can understand these points ${x^*}$.
Why do they fail to be local minima? Can we still interpret them in some meaningful way related to our objective? Or is my reasoning wrong all along?
Broadly speaking, for convex problems, the necessary conditions are also sufficient.
A nice characterisation of an optimal solution is: if the cost $f$ is convex and differentiable and the constraint set $C$ is convex, then $x^* \in C$ solves $\min_{x \in C} f(x)$ if and only if $\langle \nabla f(x^*), x-x^* \rangle \ge 0$ for all $x \in C$.
In terms of the above problem, if $x^*$ satisfies the above conditions, we have, for any $x \ge 0$, $\langle \nabla g(x^*), x-x^* \rangle = - \langle \mu^*, x-x^* \rangle = - \langle \mu^*, x \rangle \ge 0$ (the second equality uses complementary slackness ${x^*}^T \mu^* = 0$, and the inequality uses $\mu^* \le 0$ and $x \ge 0$), and so $x^*$ is a solution. In particular, every point satisfying the KKT conditions is a global minimizer, so there are no spurious KKT points here.
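The variational inequality $\langle \nabla g(x^*), x - x^* \rangle \ge 0$ can also be probed numerically at many feasible points. A sketch under the same assumptions as before (`scipy.optimize.nnls` as the solver, random data of my own choosing):

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

x_star, _ = nnls(A, b)
grad = A.T @ (A @ x_star - b)   # nabla g(x*), up to the constant factor 2

# Sample many random feasible points x >= 0 and check <grad, x - x*> >= 0
xs = rng.uniform(0, 2, size=(1000, 5))
vals = (xs - x_star) @ grad
print(np.all(vals >= -1e-8))
```

Since $\langle \nabla g(x^*), x - x^*\rangle$ reduces to $\sum_i \nabla g(x^*)_i \, x_i$ over the inactive entries, and both factors are non-negative there, the check passes for every feasible sample.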