Linear Least Squares Problem with Inequality Constraints on Residual


I have an over-determined linear least-squares problem $$\min_{\vec{x}}\Vert\mathbf{A}\vec{x}-\vec{b}\Vert_2^2,$$ where $\mathbf{A}\in\mathbb{R}^{n\times m}$, $\vec{x}\in\mathbb{R}^m$, $\vec{b}\in\mathbb{R}^n$ and $n>m$.

I can solve this with standard least-squares techniques and obtain an optimal solution $\vec{x}^{*}$. However, I have a specific requirement on the residual vector $\vec{r}\in\mathbb{R}^n$, i.e., $$ \vec{r}=\mathbf{A}\vec{x}^{*}-\vec{b}. $$ I want every component $r_i$ of the residual vector to satisfy $$ \vert r_i\vert <\frac{1}{2}, \text{ with } 1\leq i \leq n. $$

How can I find a solution for $\vec{x}$ that adheres to those constraints?

The matrix $\mathbf{A}$ has three nonzero entries per row. The number of rows/equations is on the order of $10^6$, and the number of columns/unknowns is on the order of $10^4$ to $10^5$.


In general, the condition that every $|r_i| \leq \frac{1}{2}$ can be written as $$ \|\vec{r}\|_\infty \leq \frac{1}{2} $$ (replacing your strict inequality by a non-strict one keeps the feasible set closed, which is what numerical solvers work with).

So your optimization problem becomes $$ \min_{\vec{x}} \quad \|\mathbf{A} \vec{x} - \vec{b}\|_2 \text{ subject to } \| \mathbf{A} \vec{x} - \vec{b} \|_\infty \leq \frac{1}{2} $$

Any standard convex solver can solve such problems, e.g. CVX or YALMIP.
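
For instance, a minimal sketch in CVXPY (the Python sibling of CVX); the small random instance here is purely illustrative, and the solver will report infeasibility if no $x$ keeps every residual within the bound:

```python
import cvxpy as cp
import numpy as np
import scipy.sparse as sp

# Illustrative small instance; in practice A is large and sparse,
# with three nonzero entries per row.
rng = np.random.default_rng(0)
n, m = 1000, 100
A = sp.random(n, m, density=3.0 / m, format="csr", random_state=rng)
b = rng.standard_normal(n)

x = cp.Variable(m)
residual = A @ x - b
problem = cp.Problem(cp.Minimize(cp.sum_squares(residual)),
                     [cp.norm_inf(residual) <= 0.5])
problem.solve()
```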

Update

If your generic convex solver cannot efficiently handle the sparsity structure and solves the problem slowly, there are several alternatives. Here are some of them.

Conditional Gradient (Frank Wolfe)

Solves problems of the form $\min_{x \in C} f(x)$, where $C$ is a compact convex set and $f$ is continuously differentiable with a Lipschitz continuous gradient. In your case, you can define $C = \{ x : \|A x - b\|_\infty \leq 0.5 \}$ and $f(x) = \|A x - b\|_2^2$.

The algorithm iterates the following two steps:

  1. Direction Finding - find $s_k$ which minimizes $s_k^T \nabla f(x_k)$ subject to $s_k \in C$. In our case, it reduces to a linear program. Although it is quite large, solvers like Gurobi can handle it quite quickly.
  2. Set $t = \frac{2}{k+2}$ and $x_{k+1} = x_k + t(s_k - x_k)$.

Alternatively, in step (2) you can perform a line search and find $t \in [0, 1]$ which minimizes $f(x_k + t(s_k - x_k))$. This is just minimizing a parabola in $t$ over an interval.
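
As an illustration, here is a minimal sketch of the loop, using SciPy's HiGHS-based `linprog` as the linear-minimization oracle (Gurobi would be a drop-in replacement). The function name is mine, and the sketch assumes a sparse $A$ and a feasible starting point `x0` (finding one is itself a feasibility problem):

```python
import numpy as np
import scipy.sparse as sp
from scipy.optimize import linprog

def frank_wolfe(A, b, x0, iters=100):
    """Conditional gradient for min ||Ax - b||^2 over C = {x : ||Ax - b||_inf <= 0.5}.
    A is a scipy.sparse matrix; x0 must lie in C."""
    x = x0.copy()
    # C written as linear inequalities:  A x <= b + 0.5  and  -A x <= -(b - 0.5)
    A_ub = sp.vstack([A, -A])
    b_ub = np.concatenate([b + 0.5, -(b - 0.5)])
    for k in range(iters):
        grad = 2 * A.T @ (A @ x - b)
        # Step 1: direction finding -- an LP over C
        lp = linprog(grad, A_ub=A_ub, b_ub=b_ub,
                     bounds=(None, None), method="highs")
        d = lp.x - x
        # Step 2: exact line search -- f(x + t*d) is a parabola in t on [0, 1]
        r, Ad = A @ x - b, A @ d
        denom = Ad @ Ad
        t = 1.0 if denom == 0.0 else float(np.clip(-(r @ Ad) / denom, 0.0, 1.0))
        x = x + t * d
    return x
```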

Solve a Dual

Your problem can, by introducing auxiliary variables $r = A x - b$, be written as $$ \min_{x, r} \quad \|A x - b\|_2^2 \quad \text{s.t. } r = A x - b, \; \|r\|_\infty \leq 0.5 $$

A Lagrangian with multipliers for the equality constraints is $$ L(x, r; \lambda) = \|A x - b\|_2^2 + \lambda^T(A x - b - r) $$

A dual is obtained by $$ q(\lambda) = \min_{x, r} L(x, r; \lambda) = \min_x \left\{ \| A x - b\|_2^2 + (A^T \lambda)^T x \right\} + \min_{\|r\|_\infty \leq 0.5} \left\{ -\lambda^T r \right\} - \lambda^T b $$

The first minimum is as easy as least squares: set the gradient to zero, $$ 2 A^T (A x - b) + A^T \lambda = 0 $$ Assuming $A^T A$ is invertible, this gives an explicit formula $x^*(\lambda) = (A^T A)^{-1} A^T (b - \lambda / 2)$, and hence an explicit formula for the first minimum. I will leave the algebra to you, but it is a concave quadratic function of $\lambda$, which I denote by $q_0(\lambda)$.

The second minimum is, by Hölder's inequality (the $\ell_\infty$/$\ell_1$ pairing), $-\frac{1}{2}\|\lambda\|_1$. Hence, your dual problem is maximizing $q_0(\lambda) - \frac{1}{2}\|\lambda\|_1 - \lambda^T b$, or, written as a minimization, $$ \min_{\lambda} \; -q_0(\lambda) + \frac{1}{2}\|\lambda\|_1 + \lambda^T b $$

Since $q_0$ is a concave quadratic, this is a simple Lasso-type problem, and there is a vast variety of algorithms for it, including, for example, FISTA. Having found the optimal $\lambda^*$, plug it into the formula for $x^*(\lambda)$ derived above to recover your optimal $x^*$.
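
Here is a minimal FISTA sketch of this dual route, assuming $A$ is a SciPy sparse matrix with $A^T A$ invertible; the function names and iteration count are illustrative. The gradient of the smooth part follows from the envelope theorem, $\nabla q_0(\lambda) = A x^*(\lambda)$, and its Lipschitz constant is $1/2$ because the Hessian $\frac{1}{2} A (A^T A)^{-1} A^T$ is half an orthogonal projection:

```python
import numpy as np
from scipy.sparse.linalg import splu

def solve_dual_fista(A, b, iters=500, step=2.0):
    """FISTA on  min_l  -q0(l) + 0.5*||l||_1 + l^T b.
    step = 1/L with L = 1/2 (see the lead-in above)."""
    lu = splu((A.T @ A).tocsc())    # factor A^T A once, reuse every iteration
    def x_of(lam):                  # x*(lambda) = (A^T A)^{-1} A^T (b - lambda/2)
        return lu.solve(A.T @ (b - lam / 2))
    lam = np.zeros(A.shape[0])
    y, t = lam.copy(), 1.0
    for _ in range(iters):
        grad = b - A @ x_of(y)      # envelope theorem: gradient of -q0(y) + y^T b
        z = y - step * grad
        # prox of step * (1/2)||.||_1 is soft-thresholding at step/2
        lam_new = np.sign(z) * np.maximum(np.abs(z) - step / 2, 0.0)
        t_new = (1 + np.sqrt(1 + 4 * t**2)) / 2
        y = lam_new + ((t - 1) / t_new) * (lam_new - lam)
        lam, t = lam_new, t_new
    return x_of(lam)                # recover the primal solution from lambda*
```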

Fast Dual Proximal Gradient

FDPG is a direct first-order method for solving convex problems of the form $\min_x f(x) + g(A x)$, where $g$ is a convex extended real-valued function (it can take the value $+\infty$). In your case, set $f(x) = \|A x - b\|_2^2$ and $$ g(z) = \begin{cases} 0 & \|z - b\|_\infty \leq 0.5 \\ +\infty & \text{otherwise} \end{cases} $$ The algorithm requires the strong-convexity parameter of $f$, which in your case is $2\lambda_{\min}(A^T A)$ (so $A$ must have full column rank). It also requires the proximal mapping of $g$, which here is the projection onto the box $\{ z : \|z - b\|_\infty \leq 0.5 \}$; that is just coordinatewise clipping (a sketch follows below). In each iteration you will have to solve a system of linear equations whose matrix is $A^T A$. You can compute a Cholesky factorization once, before running the algorithm, to make each iteration efficient.
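
A sketch of that projection, assuming NumPy arrays (the function name is mine):

```python
import numpy as np

def project_onto_linf_ball(z, b, radius=0.5):
    """Project z onto {z : ||z - b||_inf <= radius} by clipping each coordinate."""
    return np.clip(z, b - radius, b + radius)
```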

Have fun solving!