KKT conditions on minimization problem


I am trying to get an explicit solution to the following problem with the help of KKT conditions. But I am stuck.

The problem:

$ \min_x \ \tfrac{1}{2}\|y-x\|_2^2 + \lambda\|x\|_1 $

This is what I have done so far:

I have applied stationarity of KKT to get:

$ 0 = - (y-x) + \lambda S $

$ S_i = \operatorname{sign}(x_i) $ if $ x_i \neq 0 $, and $ S_i \in [-1,1] $ if $ x_i = 0 $.

Looking at the $i^{th} $ component of this I see:

$ y_i - x_i = \lambda S_i, \quad i = 1, \dots, r $

Looking at this we can also conclude that

If $ |y_i| < \lambda $ then $ x_i = 0 $,

and,

if $ |y_i - x_i| = \lambda $ then $ S_i = \pm 1 $.

What do I do after this in order to get an explicit solution for the problem?


Best answer:

I am assuming $\lambda \ge 0$.

For $y \in \mathbb{R}$, define $\phi_y: \mathbb{R} \to \mathbb{R}$ by $\phi_y(x) = \frac{1}{2}(x-y)^2 + \lambda |x|$.

Note that the above problem is equivalent to $\min_x \sum_i \phi_{y_i}(x_i)$, so the problem is separable and we can focus on solving $\min_x \phi_y(x)$ where $x \in \mathbb{R}$. (Since $\lim_{|x| \to \infty} \phi_y(x) = \infty$, the function is coercive, so the problem has a solution.)
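The separability claim can be checked numerically: the vector objective equals the sum of the scalar terms $\phi_{y_i}(x_i)$. This is a minimal sketch with made-up data; the function names `objective` and `phi` are my own choices, not from the answer.

```python
import numpy as np

def objective(x, y, lam):
    # vector objective: (1/2)||y - x||_2^2 + lam * ||x||_1
    return 0.5 * np.sum((y - x) ** 2) + lam * np.sum(np.abs(x))

def phi(x, y, lam):
    # scalar term: phi_y(x) = (1/2)(x - y)^2 + lam * |x|
    return 0.5 * (x - y) ** 2 + lam * abs(x)

rng = np.random.default_rng(0)
y = rng.normal(size=5)
x = rng.normal(size=5)
lam = 0.7

# the vector objective splits exactly into the sum of scalar terms
assert np.isclose(objective(x, y, lam),
                  sum(phi(xi, yi, lam) for xi, yi in zip(x, y)))
```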

If we let $\alpha(x) = |x|$ we have $\partial \alpha(x) = \begin{cases} \{ \operatorname{sgn} x \}, & x \neq 0 \\ [-1,1], & x = 0\end{cases}$.

At a solution $\hat{x}$ we have $0 \in \partial \phi_y(\hat{x})$. Since the cost is regular, we have $0 \in \{\hat{x}-y\} + \lambda \partial \alpha(\hat{x})$, or $y=\hat{x}+\lambda \xi$, where $\xi \in \partial \alpha(\hat{x})$.

If $\hat{x} = 0$, we see $|y| \le \lambda$, and if $\hat{x} \neq 0$, we see that $y = \hat{x} + \lambda \operatorname{sgn} \hat{x} = \operatorname{sgn} \hat{x} \,(|\hat{x}|+\lambda)$, and so $|y| > \lambda$.

It follows that if $|y| \le \lambda$, then $\hat{x} = 0$, and if $|y|>\lambda$, then $\hat{x} = y-\lambda \operatorname{sgn} \hat{x}=y-\lambda \operatorname{sgn} y= (\operatorname{sgn}y)(|y|-\lambda)$.

These cases can be combined to get $\hat{x} =(\operatorname{sgn}y)\max(|y|-\lambda,0)$.

Hence the solution to the original problem is given by $\hat{x}_i =(\operatorname{sgn}y_i)\max(|y_i|-\lambda,0)$.
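The closed-form solution above is the soft-thresholding operator. Here is a minimal sketch of it, cross-checked against a brute-force grid minimization of $\phi_y$; the name `soft_threshold` is my own, not from the answer.

```python
import numpy as np

def soft_threshold(y, lam):
    # closed-form minimizer of (1/2)(x - y)^2 + lam*|x|:
    # sgn(y) * max(|y| - lam, 0)
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

# Cross-check against brute-force minimization of phi_y over a fine grid.
lam = 1.0
grid = np.linspace(-5, 5, 200001)  # step 5e-5
for y in [-3.0, -0.5, 0.0, 0.4, 2.5]:
    phi_vals = 0.5 * (grid - y) ** 2 + lam * np.abs(grid)
    x_brute = grid[np.argmin(phi_vals)]
    assert abs(x_brute - soft_threshold(y, lam)) < 1e-3
```

Applied componentwise, this recovers the vector solution $\hat{x}_i = (\operatorname{sgn} y_i)\max(|y_i|-\lambda, 0)$.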