Fastest Algorithm for NLP with linear constraints


I have a minimization problem of the following form:

(I'm not a mathematician; I come from the programming side, so excuse me if my notation isn't perfectly standard.)

$Z(x) = \sqrt{x_1^2 \cdot a_1 + \dots + x_n^2 \cdot a_n} + x_1 \cdot b_1 + \dots + x_n \cdot b_n$

with the constraints:

(1) $\sum_{i=1}^n{x_i} =1$

(2) $0 < x_i \le c_i \ \text{with} \ c = (c_1, \dots, c_n)$

I'm interested in an algorithm that finds a local minimum as fast as possible, since I want to implement it.

I have already read a lot of material on different approaches for different types of constraints, but the approaches mostly didn't fit, or were hard enough to understand that I couldn't be sure whether they fit.

I understand that this is an NLP with linear constraints, but since I'm pretty new to NLP I'm hoping someone with more experience can help me. Thank you!


Answer 1

If any of the $a_i$ values are negative, all bets are off; the problem is non-convex. But if all the $a_i$ values are nonnegative, this is a convex problem, and therefore it is tractable. In fact, it can easily be converted to a second-order cone program, or SOCP, which is a relatively well-known modern standard form for convex optimization.

To convert to a "standard" SOCP, define $\bar{a}_i=\sqrt{a_i}$, and create a new variable $y\in\mathbb{R}$. Then this problem is equivalent to $$\begin{array}{ll} \text{minimize} & \sum_i b_i x_i + y \\ \text{subject to} & \sqrt{\textstyle \sum_i (\bar{a}_i x_i)^2 } \leq y \\ & \sum_i x_i = 1 \\ & 0 \leq x_i \leq c_i, ~ i=1,2,\dots,n \end{array}$$ The key here is that the objective is linear, and the constraints consist of linear equations and inequalities plus convex norm constraints. No nonlinearities other than convex $\ell_2$-norm constraints are permitted in an SOCP.
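
As a sanity check on this reformulation, here is a small numerical sketch (with made-up values for $a$, $b$, and a feasible $x$) confirming that for any fixed feasible $x$, the best choice of the epigraph variable $y$ is exactly the norm, so the SOCP objective agrees with $Z(x)$:

```python
import numpy as np

# Hypothetical small instance; a, b, x are illustrative values only.
rng = np.random.default_rng(0)
n = 5
a = rng.uniform(0.5, 2.0, n)   # a_i > 0 so the problem is convex
b = rng.normal(size=n)
x = rng.uniform(0.1, 1.0, n)
x /= x.sum()                   # a feasible point: sum(x) == 1

a_bar = np.sqrt(a)             # \bar{a}_i = sqrt(a_i)

# Original objective Z(x).
z_direct = np.sqrt(np.sum(a * x**2)) + b @ x

# SOCP form: minimize b@x + y subject to ||a_bar * x||_2 <= y.
# For fixed x, the smallest feasible y is exactly the norm,
# so the two objective values agree.
y_best = np.linalg.norm(a_bar * x)
z_socp = b @ x + y_best

print(z_direct, z_socp)
```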

Here is a good online resource on convex optimization, with a section on second-order cone programming. There are a variety of solvers out there that can solve SOCPs. If you don't mind me offering my own software, the MATLAB-based modeling framework CVX can handle this quite readily; you don't even need to do the SOCP conversion. Here is the model in CVX, assuming the values of $a_i$, $b_i$, and $c_i$ are stored in MATLAB vectors a, b, and c:

cvx_begin
    variable x(n)                        % n = length of a, b, and c
    minimize( norm(sqrt(a).*x) + b'*x )  % norm(sqrt(a).*x) = sqrt(a1*x1^2+...+an*xn^2)
    subject to
        sum(x) == 1;
        0 <= x <= c;
cvx_end

CVX will certainly not give you the fastest results; it is meant to make problems easy to specify. But you can worry about speed after you're satisfied with the results.
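
If you'd rather prototype outside MATLAB, here is a minimal sketch of the same model using SciPy's SLSQP solver (a general NLP method, not a dedicated SOCP solver); the instance values for a, b, and c below are made up for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative instance: a_i >= 0 (convexity), sum(c) >= 1 (feasibility).
a = np.array([1.0, 2.0, 0.5, 1.5])
b = np.array([0.3, -0.1, 0.2, 0.0])
c = np.array([0.6, 0.5, 0.4, 0.8])

def objective(x):
    # Z(x) = sqrt(sum_i a_i x_i^2) + sum_i b_i x_i
    return np.sqrt(np.sum(a * x**2)) + b @ x

constraints = [{"type": "eq", "fun": lambda x: np.sum(x) - 1.0}]
bounds = [(0.0, ci) for ci in c]     # 0 <= x_i <= c_i
x0 = c / c.sum()                     # feasible starting point: sums to 1

res = minimize(objective, x0, method="SLSQP",
               bounds=bounds, constraints=constraints)
print(res.x, res.fun)
```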

Answer 2

First, a couple of minor points. I'm assuming $a_i > 0$, so the problem is convex, and $\sum_{i=1}^n c_i \ge 1$, so the problem is feasible. Also, there is no guarantee that there will be a minimum with every $x_i$ strictly positive, so the constraints should be $0 \le x_i \le c_i$. If it's unacceptable in your application for the $x_i$ to be zero, then you should specify some minimum nonzero value for them.

While I'm not sure what the best algorithm for this problem would be, here's at least a partial solution. For this particular problem, you can almost find a closed-form solution: you can reduce it to a search over two parameters.

Let $C$ be the feasible region (the intersection of a box and a plane):
$$ C := \{ \mathbf{x} \in \mathbb{R}^n \ \mid \ 0 \le x_i \le c_i, \ \sum_{i=1}^n x_i = 1 \} $$
In this particular case, it's relatively easy to project onto $C$. By introducing another parameter, you can see that the optimal $\mathbf{x}$ will be a (scaled) projection onto $C$. This uses the standard trick that $\sqrt{w} = \min_{s>0} \frac{s}{2} + \frac{w}{2s}$ (minimized at $s = \sqrt{w}$):
$$ \begin{aligned} \min_{\mathbf{x} \in C} \sqrt{ \sum_{i=1}^n a_i x_i^2} + \sum_{i=1}^n b_i x_i &= \min_{\mathbf{x} \in C} \min_{s > 0} \frac{s}{2}+ \frac{1}{2s}\sum_{i=1}^n a_i x_i^2 + \sum_{i=1}^n b_i x_i\\ &= \min_{s > 0} \frac{s}{2} +\frac{1}{2s} \left(\min_{\mathbf{x} \in C}\sum_{i=1}^n a_i x_i^2 + 2 s b_i x_i \right) \\ &= \min_{s > 0} \frac{s}{2}\left(1 - \sum_{i=1}^n b_i^2 a_i^{-1} \right) +\frac{1}{2s} \left(\min_{\mathbf{x} \in C}\sum_{i=1}^n a_i (x_i + s b_i a_i^{-1})^2 \right) \end{aligned} $$
I'll skip the derivation of the formula for the scaled projection onto $C$; it ends up being the following. Define $\mathbf{y}(s,t)$ by
$$ y_i(s,t) := \max(0,\min(c_i,(t-b_i s)/a_i)) $$
Then the optimal solution to the problem is $\mathbf{x}^* = \mathbf{y}(s^*,t^*)$, where the optimal parameters $s^*$ and $t^*$ are chosen to satisfy:
$$ \begin{gathered} \sum_{i=1}^n y_i(s^*,t^*) = 1\\ \sqrt{\sum_{i=1}^n a_i y_i(s^*,t^*)^2} = s^* \end{gathered} $$
So the problem reduces to finding $s^*$ and $t^*$. For fixed $s$, you can solve for the $t$ that satisfies $\sum_{i=1}^n y_i(s,t) = 1$ in $O(n)$ time. This paper has the case of the unit simplex; see the references on the linked page for the general case.
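
The inner step can be sketched in code. For fixed $s$, $\sum_i y_i(s,t)$ is continuous and nondecreasing in $t$, so plain bisection finds the $t$ with sum equal to 1 (slower than the $O(n)$ method referenced above, but simple); the values of a, b, and c are made up for illustration:

```python
import numpy as np

# Illustrative instance: a_i > 0 and sum(c) >= 1 keep the problem
# convex and feasible.
a = np.array([1.0, 2.0, 0.5])
b = np.array([0.3, -0.1, 0.2])
c = np.array([0.7, 0.6, 0.5])

def y(s, t):
    """y_i(s, t) = (t - b_i s)/a_i, clipped to [0, c_i]."""
    return np.clip((t - b * s) / a, 0.0, c)

def solve_t(s, iters=100):
    """Bisection on t so that sum_i y_i(s, t) = 1."""
    lo = np.min(b * s)            # here every y_i = 0, so the sum is 0
    hi = np.max(b * s + a * c)    # here every y_i = c_i, sum = sum(c) >= 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if y(s, mid).sum() < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

t = solve_t(1.0)
print(t, y(1.0, t).sum())
```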

Unfortunately, I'm not sure what the best approach is for finding both $s^*$ and $t^*$, though I'd expect that somewhere in the CS literature there's an algorithm with this problem as a special case that runs in $O(n)$ or $O(n^2)$ time. If you just want to use a simple bisection approach, you can use the following bounds to get your starting intervals: $$ \begin{gathered} \min_i \sqrt{a_i} \le s^* \le \max_i \sqrt{a_i} \\ \min_i b_i s^* \le t^* \le \max_i (b_i s^* + c_i) \end{gathered} $$
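
Putting the pieces together, here is a hedged sketch of the whole two-level bisection with made-up instance values. The outer loop bisects on the residual $h(s) = \sqrt{\sum_i a_i y_i^2} - s$; since $\sum_i y_i = 1$ forces $\sqrt{\sum_i a_i y_i^2}$ into $[\sqrt{\min_i a_i / n}, \sqrt{\max_i a_i}]$, the bracket $(0, \sqrt{\max_i a_i}]$ contains a sign change:

```python
import numpy as np

# Illustrative instance (a_i > 0, sum(c) >= 1).
a = np.array([1.0, 2.0, 0.5])
b = np.array([0.3, -0.1, 0.2])
c = np.array([0.7, 0.6, 0.5])

def y(s, t):
    return np.clip((t - b * s) / a, 0.0, c)

def solve_t(s, iters=100):
    # Inner bisection: sum_i y_i(s, t) is nondecreasing in t.
    lo, hi = np.min(b * s), np.max(b * s + a * c)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if y(s, mid).sum() < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def residual(s):
    # h(s) = sqrt(sum_i a_i y_i^2) - s; a root satisfies both
    # optimality conditions stated above.
    t = solve_t(s)
    return np.sqrt(np.sum(a * y(s, t) ** 2)) - s

lo, hi = 1e-9, np.sqrt(a.max())   # residual > 0 near 0, <= 0 at sqrt(max a_i)
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if residual(mid) > 0.0:
        lo = mid
    else:
        hi = mid

s_star = 0.5 * (lo + hi)
t_star = solve_t(s_star)
x_star = y(s_star, t_star)
obj = np.sqrt(np.sum(a * x_star**2)) + b @ x_star
print(x_star, obj)
```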