LASSO (or BPDN) parameter tuning


We have to solve the following problem:

$$\min_x \|x\|_1 \text{ s.t. } \|Ax-y\| \le \sigma,$$

with $\sigma$ some positive real number, $A$ a complex rectangular matrix and $x,y$ complex vectors.

We know that, for each $\sigma$ in some interval $(a,b)$, $0<a<b$, there exists a unique solution $x_\sigma$. The function $f:\sigma \mapsto \|x_\sigma\|_1$ is known to be continuously differentiable and convex on $(a,b)$ (Article). However, I recall a teacher telling me that taking the smallest $\sigma \in (a,b)$ such that $f''(\sigma)\approx 0$ is often a good choice. And indeed that seems to be the case for the data I am considering, in the sense that the model built from this solution is (almost) the closest to the true solution.

However, this is not my research domain and I have been unable to find a reference for this. Can someone provide one, or at least some insight?

Thank you.

EDIT: the proposed similar question is different, as I am asking here for a criterion to choose the error tolerance $\sigma$. Writing the problem in another formulation (e.g. $\min_x \|Ax-y\|^2_2 + \lambda \|x\|_1$) just shifts the problem, since another parameter ($\lambda$ or similar) then needs to be tuned.
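To make the criterion concrete, here is a minimal sketch of detecting the smallest $\sigma$ with $f''(\sigma)\approx 0$ by finite differences. A toy convex decreasing function stands in for $f$; in practice each value $f(\sigma)$ would come from solving the BPDN problem at that $\sigma$ with a solver (not shown here). The grid, tolerance, and toy $f$ are all illustrative choices:

```python
import numpy as np

# Toy stand-in for f(sigma): convex, decreasing, with f'' -> 0.
# Here f(s) = exp(-5 s), so f''(s) = 25 exp(-5 s) analytically.
sigmas = np.arange(0.0, 2.0 + 1e-12, 0.01)
f = np.exp(-5.0 * sigmas)

# Central second difference on the interior grid points.
h = sigmas[1] - sigmas[0]
f2 = (f[2:] - 2.0 * f[1:-1] + f[:-2]) / h**2

# Smallest sigma where |f''| drops below a chosen tolerance.
tol = 0.1
idx = np.argmax(np.abs(f2) < tol)
sigma_star = sigmas[1:-1][idx]
```

Since $25 e^{-5\sigma} < 0.1$ exactly when $\sigma > \ln(250)/5 \approx 1.104$, `sigma_star` lands on the first grid point past that value.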

BEST ANSWER

After some time and some research of my own, I have gained some understanding (to some extent) of my own question. I share what I found below.

Tikhonov regularization

Problems in this field can be written as:

$$\min_x \| Ax-y\|^2+\|\Gamma x\|^2$$

In the general formulation, $\Gamma$ is a matrix. However, many people work with a particular version of the problem where $\Gamma$ is a scalar multiple of the identity, $\Gamma = aI$ with $a$ a real number. If we look at the curve parametrized by $a\mapsto (\|x(a)\|,\| Ax(a)-y\|)$, where $x(a)$ is the solution of the Tikhonov problem with $\Gamma = aI$ (the so-called L-curve), a good criterion is to choose $a$ so that we sit at the point of maximum curvature of this curve.
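Here is a minimal numerical sketch of this criterion on a small random real least-squares instance (the problem sizes and the grid of $a$ values are arbitrary choices). Each Tikhonov solution is computed from the normal equations, and the curvature of the L-curve parametrization is estimated with finite differences:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((60, 40))
y = rng.standard_normal(60)

AtA, Aty = A.T @ A, A.T @ y
a_grid = np.geomspace(1e-3, 10.0, 200)
n = np.empty_like(a_grid)  # solution norms  ||x(a)||
r = np.empty_like(a_grid)  # residual norms  ||A x(a) - y||
for i, a in enumerate(a_grid):
    # Tikhonov solution with Gamma = a I, via the normal equations.
    x = np.linalg.solve(AtA + a**2 * np.eye(A.shape[1]), Aty)
    n[i] = np.linalg.norm(x)
    r[i] = np.linalg.norm(A @ x - y)

# Curvature of the curve a -> (n(a), r(a)) by finite differences.
dn, dr = np.gradient(n, a_grid), np.gradient(r, a_grid)
d2n, d2r = np.gradient(dn, a_grid), np.gradient(dr, a_grid)
kappa = (dn * d2r - dr * d2n) / ((dn**2 + dr**2) ** 1.5 + 1e-12)

# Maximum-curvature choice (endpoints excluded, where the
# one-sided differences are less reliable).
a_star = a_grid[np.argmax(np.abs(kappa[1:-1])) + 1]
```

As expected for Tikhonov regularization, the solution norm `n` decreases and the residual `r` increases as `a` grows, which is exactly the trade-off the L-curve displays.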

Let $r(a) = \|Ax(a)-y\|$ be the residual norm and $n(a) =\|x(a)\|$ the solution norm; the curvature formula gives us:

$$\kappa(a) =\frac{n'(a)r''(a)-r'(a)n''(a)}{(n'(a)^2+r'(a)^2)^{\frac 3 2}}.$$
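As a quick numerical sanity check of this formula, we can apply it (via finite differences) to a curve whose curvature is known everywhere: a quarter circle of radius $R$, parametrized here so that $n$ and $r$ play the roles of the two coordinates, has curvature $1/R$:

```python
import numpy as np

R = 2.0
t = np.linspace(0.1, np.pi / 2 - 0.1, 1000)
n = R * np.cos(t)  # plays the role of n(a)
r = R * np.sin(t)  # plays the role of r(a)

# Same curvature formula, derivatives taken by finite differences.
dn, dr = np.gradient(n, t), np.gradient(r, t)
d2n, d2r = np.gradient(dn, t), np.gradient(dr, t)
kappa = (dn * d2r - dr * d2n) / (dn**2 + dr**2) ** 1.5
```

Away from the endpoints, `kappa` is numerically $1/R = 0.5$, as it should be.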

And a theorem characterizing the solutions of such convex problems tells us that the data-fidelity constraint is active at the solution, i.e. $r(a)=a$ when the curve is parametrized by the error tolerance, hence $r'(a)=1$ and $r''(a)=0$. Thus:

$$\kappa(a) =\frac{-n''(a)}{(n'(a)^2+1)^{\frac 3 2}}.$$

Looking at the L-curve, we see that $n''(a)$ is close to $0$ before the corner, becomes large around it, and is close to $0$ again past it. Because of the formula above, the curvature (in absolute value) has roughly the same behaviour. Thus the smallest $a$ such that $n''(a) \approx 0$ nearly coincides with the point of maximum curvature, which justifies the "good choice".

For $L_1$ minimization

All the previous computations still hold; we just need to replace $n$ by $f$. However, the shape of the curve is smoother, and using the curvature directly instead of the heuristic $f''\approx 0$ becomes necessary. Several articles have successfully applied the maximum-curvature criterion to these types of problems.
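A minimal sketch of the $\ell_1$ case, using the penalized formulation $\min_x \tfrac12\|Ax-y\|_2^2 + \lambda\|x\|_1$ solved by plain ISTA (iterative soft-thresholding) over a grid of $\lambda$ values, and then applying the same finite-difference curvature to the curve $\lambda \mapsto (\|x(\lambda)\|_1, \|Ax(\lambda)-y\|)$. The problem sizes, grid, and iteration count are arbitrary choices, and ISTA is just one simple solver among many:

```python
import numpy as np

def soft(z, t):
    """Soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(A, y, lam, iters=500):
    """Basic ISTA for 0.5 ||Ax - y||^2 + lam ||x||_1."""
    tau = 1.0 / np.linalg.norm(A, 2) ** 2  # step size 1 / ||A||^2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft(x - tau * (A.T @ (A @ x - y)), tau * lam)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 80))
x_true = np.zeros(80)
x_true[:5] = rng.standard_normal(5)  # sparse ground truth
y = A @ x_true

# Grid of lambda values up to just above ||A^T y||_inf,
# beyond which the solution is exactly zero.
lams = np.geomspace(1e-3, np.abs(A.T @ y).max() * 1.01, 30)
f = np.array([np.abs(ista(A, y, lam)).sum() for lam in lams])
r = np.array([np.linalg.norm(A @ ista(A, y, lam) - y) for lam in lams])

# Curvature of lambda -> (f, r) by finite differences.
df, dr = np.gradient(f, lams), np.gradient(r, lams)
d2f, d2r = np.gradient(df, lams), np.gradient(dr, lams)
kappa = (df * d2r - dr * d2f) / ((df**2 + dr**2) ** 1.5 + 1e-12)
lam_star = lams[np.argmax(np.abs(kappa[1:-1])) + 1]
```

At the largest $\lambda$ in the grid the solution is exactly zero (the threshold dominates the gradient step), while at the smallest it is dense, so the curve sweeps out the full trade-off before the curvature criterion picks `lam_star`.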