Closed-form solution of $\mathbb{R}^2$ LASSO?


The problem:

$$\min_{x, \, y} \quad (ax + by + 1)^2 + |x| + |y| + (x-1)^2 + (y-1)^2$$

where all variables are in $\mathbb{R}$. Can I find the optimal $(x_p, y_p)$ in terms of $a$ and $b$?

Edited: I have tried two ways:

  1. One way is to set the (sub-) gradients to zero: $$ \begin{cases} 2a(ax_p + by_p + 1) + \text{sign}(x_p) + 2(x_p-1) = 0 \\ 2b(ax_p + by_p + 1) + \text{sign}(y_p) + 2(y_p-1) = 0 \end{cases} \quad\Rightarrow\quad \begin{bmatrix} x_p \\ y_p \end{bmatrix} = \begin{bmatrix} 2a^2 + 2 & 2ab \\ 2ab & 2b^2 + 2 \end{bmatrix}^{-1} \begin{bmatrix} 2 - 2a - \text{sign}(x_p)\\ 2 - 2b - \text{sign}(y_p)\\ \end{bmatrix} $$ Since the problem is convex, any point satisfying these conditions is optimal. However, the unknown signs $\text{sign}(x_p)$ and $\text{sign}(y_p)$ remain tangled into the final expression.

  2. Another way is to split the problem as: $$\min_{x, \, y} \quad \underbrace{(ax + by + 1)^2 + (x-1)^2 + (y-1)^2}_{f(x,y)} + \underbrace{|x| + |y|}_{g(x,y)} $$ Geometrically, the optimal $(x_p,y_p)$ would be obtained by shrinking each coordinate of $(x_0,y_0)$ (the optimum of $f$) toward the origin $(0,0)$ (the optimum of $g$). That is, $$(x_p,y_p) = (t_x x_0, t_y y_0)$$ where $t_x, t_y$ are in $[0,1]$. $(x_0,y_0)$ can be easily obtained by setting $\nabla f(x,y) = 0$: $$ \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} = \begin{bmatrix} 2a^2 + 2 & 2ab \\ 2ab & 2b^2 + 2 \end{bmatrix}^{-1} \begin{bmatrix} 2 - 2a\\ 2 - 2b\\ \end{bmatrix} $$ For $(x_p,y_p)$ obtained from Method 1, the tangled signs can then be determined from $(x_0,y_0)$.
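The case analysis behind Method 1 (with the zero-coordinate cases that Method 2 helps anticipate) can be carried out exhaustively: each of $\text{sign}(x_p)$, $\text{sign}(y_p)$ is $-1$, $0$, or $+1$, where a zero entry pins that coordinate at $0$ and requires a subgradient of the absolute value in $[-1,1]$. A minimal sketch in Python/NumPy; the function name `lasso2d` is illustrative:

```python
import numpy as np
from itertools import product

def lasso2d(a, b):
    """Minimize (ax+by+1)^2 + |x| + |y| + (x-1)^2 + (y-1)^2 by
    enumerating the 9 patterns of (sign(x), sign(y))."""
    def smooth_grad(x, y):
        # gradient of the differentiable part f(x, y)
        r = a * x + b * y + 1
        return 2 * a * r + 2 * (x - 1), 2 * b * r + 2 * (y - 1)

    for sx, sy in product([-1, 0, 1], repeat=2):
        if sx and sy:                       # both coordinates nonzero
            M = np.array([[2 * a * a + 2, 2 * a * b],
                          [2 * a * b, 2 * b * b + 2]])
            x, y = np.linalg.solve(M, [2 - 2 * a - sx, 2 - 2 * b - sy])
            if np.sign(x) == sx and np.sign(y) == sy:
                return x, y
        elif sx:                            # y pinned at 0
            x = (2 - 2 * a - sx) / (2 * a * a + 2)
            _, gy = smooth_grad(x, 0.0)
            if np.sign(x) == sx and abs(gy) <= 1:
                return x, 0.0
        elif sy:                            # x pinned at 0
            y = (2 - 2 * b - sy) / (2 * b * b + 2)
            gx, _ = smooth_grad(0.0, y)
            if np.sign(y) == sy and abs(gx) <= 1:
                return 0.0, y
        else:                               # both pinned at 0
            gx, gy = smooth_grad(0.0, 0.0)
            if abs(gx) <= 1 and abs(gy) <= 1:
                return 0.0, 0.0
    raise RuntimeError("no consistent sign pattern found")
```

For example, at $a=b=1$ no strict sign pattern is self-consistent and the optimum turns out to be the origin itself, which the zero-pattern check catches.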

Best answer

I believe the closest to a closed-form solution you can get is via duality. Taking $$ \mathbf{z} = (x, y)^T, \quad \mathbf{b} = (-1, 1, 1)^T , \quad \mathbf{A} = \begin{pmatrix}a&b \\ 1&0 \\ 0&1\end{pmatrix} $$ your problem is $$ \min_{\mathbf{z}} \quad \|\mathbf{A} \mathbf{z} - \mathbf{b}\|_2^2 + \|\mathbf{z}\|_1. $$ Adding the constraint $\mathbf{w}=\mathbf{z}$ we obtain $$ \begin{aligned} \min_{\mathbf{z}, \, \mathbf{w}} &\quad \|\mathbf{A} \mathbf{z} - \mathbf{b}\|_2^2 + \|\mathbf{w}\|_1 \\ \text{s.t.} &\quad \mathbf{w}-\mathbf{z}=0 \end{aligned} $$ The Lagrangian is $$ L(\mathbf{z}, \mathbf{w}; \mathbf{u}) = \|\mathbf{A} \mathbf{z} - \mathbf{b}\|_2^2 + \|\mathbf{w}\|_1 + \mathbf{u}^T(\mathbf{z}-\mathbf{w}). $$ Therefore, the dual objective is $$ q(\mathbf{u}) = \min_{\mathbf{z}} \{ \|\mathbf{A} \mathbf{z} - \mathbf{b}\|_2^2 + \mathbf{u}^T \mathbf{z} \} + \min_{\mathbf{w}} \{ \|\mathbf{w}\|_1 - \mathbf{u}^T \mathbf{w}\} $$ The second minimization has a very simple value: it is $0$ when $\|\mathbf{u}\|_{\infty} \leq 1$ and $-\infty$ otherwise, so it contributes the box constraint $\|\mathbf{u}\|_{\infty} \leq 1$ to the dual. The first minimization is a convex quadratic problem whose optimal solution, obtained by setting the gradient to zero, is $$ \mathbf{z}^* = \frac{1}{2}(\mathbf{A^T A})^{-1}(2 \mathbf{A}^T \mathbf{b} - \mathbf{u}) $$ and whose optimal value is $$ -\frac{1}{4} (\mathbf{u} - 2\mathbf{A}^T \mathbf{b})^T (\mathbf{A}^T \mathbf{A})^{-1} (\mathbf{u} - 2\mathbf{A}^T \mathbf{b}) + \|\mathbf{b}\|_2^2. $$ So the dual problem, up to constants, is $$ \begin{aligned} \max_{\mathbf{u}} &\quad -(\mathbf{u} - 2\mathbf{A}^T \mathbf{b})^T (\mathbf{A}^T \mathbf{A})^{-1} (\mathbf{u} - 2\mathbf{A}^T \mathbf{b}) \\ \text{s.t.} &\quad -1 \leq u_1 \leq 1 \\ &\quad -1 \leq u_2 \leq 1 \end{aligned} $$
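As a quick sanity check, the matrix reformulation above can be compared numerically against the original objective; the function names below are illustrative:

```python
import numpy as np

def primal_original(a, b, x, y):
    # objective exactly as stated at the top of the question
    return (a * x + b * y + 1) ** 2 + abs(x) + abs(y) \
        + (x - 1) ** 2 + (y - 1) ** 2

def primal_matrix(a, b, x, y):
    # ||Az - b||_2^2 + ||z||_1 with A, b as defined in the answer
    A = np.array([[a, b], [1.0, 0.0], [0.0, 1.0]])
    bvec = np.array([-1.0, 1.0, 1.0])
    z = np.array([x, y], dtype=float)
    return float(np.sum((A @ z - bvec) ** 2) + np.abs(z).sum())
```

The two should agree for any $(a, b, x, y)$.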

The dual is easily solved via the KKT conditions by solving 16 linear systems, which arise from the possible patterns of the Lagrange multipliers (each one zero or nonzero). The primal solution is then recovered from the dual optimum $\mathbf{u}^*$ by the formula found earlier: $$ \mathbf{z}^* = \frac{1}{2}(\mathbf{A^T A})^{-1}(2 \mathbf{A}^T \mathbf{b} - \mathbf{u}^*) $$
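The same enumeration can be grouped per coordinate (each $u_i$ free, clamped at $+1$, or clamped at $-1$), giving 9 cases instead of 16. A sketch in Python/NumPy, assuming illustrative values of $a$ and $b$; the function name is hypothetical:

```python
import numpy as np
from itertools import product

def solve_dual_box_qp(a, b):
    """Maximize -(u - c)^T Q (u - c) over the box [-1, 1]^2, i.e.
    minimize (u - c)^T Q (u - c), by enumerating active-set
    patterns, then recover the primal z* = (1/2) Q (c - u)."""
    A = np.array([[a, b], [1.0, 0.0], [0.0, 1.0]])
    bvec = np.array([-1.0, 1.0, 1.0])
    Q = np.linalg.inv(A.T @ A)            # (A^T A)^{-1}
    c = 2.0 * A.T @ bvec                  # 2 A^T b
    for pattern in product([-1, 0, 1], repeat=2):
        u = np.array(pattern, dtype=float)   # clamped values; 0 = free
        free = [i for i in (0, 1) if pattern[i] == 0]
        clamped = [i for i in (0, 1) if pattern[i] != 0]
        if free:
            # stationarity on the free coordinates: [Q (u - c)]_F = 0
            QFF = Q[np.ix_(free, free)]
            rhs = (Q[np.ix_(free, clamped)] @ (u[clamped] - c[clamped])
                   if clamped else np.zeros(len(free)))
            u[free] = c[free] - np.linalg.solve(QFF, rhs)
            if np.any(np.abs(u[free]) > 1 + 1e-12):
                continue
        g = 2.0 * Q @ (u - c)             # gradient of (u - c)^T Q (u - c)
        # a coordinate clamped at +/-1 needs the gradient pushing outward
        if all(g[i] * pattern[i] <= 1e-12 for i in clamped):
            z = 0.5 * Q @ (c - u)         # primal recovery from the text
            return u, z
    raise RuntimeError("no KKT-consistent pattern found")
```

For instance, at $a=b=0$ the dual optimum clamps both coordinates at $+1$ and the recovered primal is $(0.5, 0.5)$, matching the primal sign analysis.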

Important

Note that when solving the dual, you can avoid inverting matrices altogether. Just write the equations explicitly, and use the fact that you already have a closed-form expression for $(\mathbf{A}^T \mathbf{A})^{-1}$.
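For completeness, since $\mathbf{A}^T \mathbf{A}$ is only $2 \times 2$, that closed form is easy to write out: $$ \mathbf{A}^T \mathbf{A} = \begin{pmatrix} a^2 + 1 & ab \\ ab & b^2 + 1 \end{pmatrix}, \qquad (\mathbf{A}^T \mathbf{A})^{-1} = \frac{1}{a^2 + b^2 + 1} \begin{pmatrix} b^2 + 1 & -ab \\ -ab & a^2 + 1 \end{pmatrix}, $$ using $\det(\mathbf{A}^T \mathbf{A}) = (a^2+1)(b^2+1) - a^2 b^2 = a^2 + b^2 + 1$.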