In machine learning and statistics, why does the lasso path slope down to the right?


In Lasso regression, for a sparse estimate of coefficients $\beta$, we have:

$$ \hat{\beta}(\lambda) = \arg \min_b \Bigl\{\frac{1}{2} \|y-Xb\|^2_{2} + \lambda\|b\|_1\Bigr\} $$

One graph I saw plots the coefficient values of $\hat{\beta}$ against the $\lambda$ parameter:

[Plot: coefficient paths $\hat{\beta}_j(\lambda)$ vs. $\lambda$, with each curve shrinking toward zero as $\lambda$ grows]

My question is: why do the paths slope downward to zero? In other words, why do the coefficient estimates shrink to zero as we increase $\lambda$? I tried a thought experiment where I let the $\lambda\|b\|_1$ term grow large, but I still fail to see the connection. Specifically:

1) Do lasso paths always shrink to zero as $\lambda$ gets large?

2) Do the coefficient values always start positive? (At $\lambda = 0$ we just have OLS.)

3) What is the intuition here?
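To make the shrinkage concrete, here is a small numerical experiment I would expect to reproduce the behavior in the plot. It is a minimal sketch of coordinate descent for the objective above using the standard soft-thresholding update (the function names `soft_threshold` and `lasso_cd`, and the synthetic data, are my own, not from any particular source):

```python
import numpy as np

def soft_threshold(z, gamma):
    """One-dimensional lasso solution: sign(z) * max(|z| - gamma, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)  # per-feature squared norms
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with feature j's contribution removed.
            r_j = y - X @ b + X[:, j] * b[j]
            # Each update subtracts lam from |X_j^T r_j| before keeping it,
            # so a large enough lam snaps b[j] to exactly zero.
            b[j] = soft_threshold(X[:, j] @ r_j, lam) / col_sq[j]
    return b

# Synthetic data: three true coefficients of mixed sign.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([3.0, -2.0, 0.5]) + rng.normal(scale=0.5, size=100)

for lam in [0.0, 5.0, 50.0, 500.0]:
    print(lam, np.round(lasso_cd(X, y, lam), 3))
```

Running this, the $\lambda = 0$ row is the OLS fit, and as $\lambda$ grows each coefficient is pulled toward zero, with smaller ones hitting exactly zero first; the soft-thresholding step is exactly the mechanism the paths are tracing out.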

Thanks!