Equivalent formulation of LASSO?

623 Views Asked by At

I am currently trying to tell wheter or not those two problems are equivalent :

$$\min_x \|x\|_1 \text { s.t. } \|Ax-y\|^2_2 \le \varepsilon.$$

And

$$\min_x \|Ax-y\|^2_2 \text { s.t. } \|x\|_1\le t.$$

With $x$ and $y$ vectors and $A$ a compatible matrix.


I know that the second problem is equivalent to :

$$\min_x \|Ax-y\|^2_2 + \lambda \|x\|_1.$$

(Lagrangian form)

The same process for the first problem would give :

$$\min_x \|x\|_1 + \mu \|Ax-y\|^2_2$$

Does this lead somewhere ?

Thank for your help.

1

There are 1 best solutions below

1
On BEST ANSWER

The first problem $$\min_x \|x\|_1 \text { subject to } \|Ax-y\|^2_2 \le \epsilon$$ can be described by the equivalent saddle point problem: $$ \min_x\max_{\lambda\ge 0}\left[ \|x\|_1 +\lambda(\|Ax-y\|_2^2-\epsilon)\right]=\max_{\lambda\ge 0}\min_x\left[ \|x\|_1 +\lambda(\|Ax-y\|_2^2-\epsilon)\right], $$ by Lagrange's multiplier theorem. Let $\lambda_0\ge 0$ be the solution of the above problem such that $$ \max_{\lambda\ge 0}\min_x\left[ \|x\|_1 +\lambda(\|Ax-y\|_2^2-\epsilon)\right]=\min_x\left[ \|x\|_1 +\lambda_0(\|Ax-y\|_2^2-\epsilon)\right]. $$If we have $\lambda_0 = 0$, then the minimizer should be $x=0$ and $\|y\|^2_2\leq\epsilon.$ Let us exclude this trivial case and assume that $\lambda_0>0$. By complementary slackness, we should have $\|Ax_0-y\|_2^2 =\epsilon$ where $x_0$ is the solution of the first problem.

In the same manner, the second problem can be equivalently described as $$ \min_{x'}\max_{\mu\ge 0}\left[ \|Ax'-y\|_2^2 +\mu (\|x'\|_1-t)\right]=\max_{\mu\ge 0}\min_{x'}\left[ \|Ax'-y\|_2^2 +\mu (\|x'\|_1-t)\right]\quad\cdots(*), $$ where $t = \|x_0\|_1$. If we look at the formulation of the second problem carefully, we can see that the solution $(\mu_0, x_0')$ should be $(\frac{1}{\lambda_0},x_0)$. To see this, one can plug $\mu=\lambda_0^{-1}$ and $x_0'=x_0$ into the $(*)$ to check if equality holds. In fact, equality holds since given $\mu=\lambda_0^{-1}$, $x=x_0$ is the minimizer and conversely, given $x=x_0$, $\mu =\lambda_0^{-1}$ is the maximizer. This establishes that Lagrangian dual problems in this case (in fact, generally) are equivalent.