The best example to demonstrate my question is the elastic net, whose risk (here for linear regression) is the following. For some dataset $D=\{(x_i,y_i)\}_{i=1}^n$ with $x_i\in \mathbb{R}^d$, $y_i\in\mathbb{R}$, and some $\lambda_1, \lambda_2 \geq 0$:
$$R(w) = \sum_{i=1}^n (w^Tx_i -y_i)^2 + \lambda_1 ||w||_1 + \lambda_2 ||w||_2^2$$
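To make the definition concrete, here is a small sketch computing $R(w)$ on toy data (the data, the function name `elastic_net_risk`, and the parameter values are my own, purely for illustration):

```python
import numpy as np

# Toy data, purely illustrative
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))   # n = 5 samples, d = 3 features (rows are x_i^T)
y = rng.normal(size=5)
w = rng.normal(size=3)
lam1, lam2 = 0.1, 0.5

def elastic_net_risk(w, X, y, lam1, lam2):
    """R(w) = sum_i (w^T x_i - y_i)^2 + lam1 * ||w||_1 + lam2 * ||w||_2^2"""
    residuals = X @ w - y
    return (residuals @ residuals
            + lam1 * np.abs(w).sum()   # L1 penalty, not squared
            + lam2 * (w @ w))          # squared L2 penalty

print(elastic_net_risk(w, X, y, lam1, lam2))
```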
Why is the $L_2$ norm squared? Is it just because of the derivative, or is there some other reason behind it?
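To illustrate the derivative part of my question: the gradient of $||w||_2^2$ is $2w$, which is smooth and vanishes at the origin, while the gradient of $||w||_2$ is $w/||w||_2$, which is undefined at $w=0$ and has unit length arbitrarily close to it. A quick numerical check (function names are mine):

```python
import numpy as np

def grad_l2_squared(w):
    # gradient of ||w||_2^2: smooth everywhere, linear in w
    return 2 * w

def grad_l2(w):
    # gradient of ||w||_2: undefined at w = 0
    return w / np.linalg.norm(w)

w = np.array([1e-8, 0.0])          # a point very close to the origin
print(grad_l2_squared(w))          # vanishes smoothly near 0
print(grad_l2(w))                  # stays at unit length near 0
```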
I also asked this question on a forum at my university. One thing a teaching assistant mentioned:
The norm $||w||_1$ is linear in the $|w_i|$ and $||w||_2^2$ is linear in the $|w_i|^2$, which makes the non-squared $L_1$ norm and the squared $L_2$ norm computationally and theoretically more convenient and easier to handle.
Note that $||w||_1^2 = ||w||_2^2 + \sum_{j}\sum_{i\neq j}|w_j|\,|w_i|$ is not linear in the $|w_i|$, and $||w||_2$ itself is also not linear in the $|w_i|$. Thus the squared $L_1$ norm and the non-squared $L_2$ norm are harder to handle.
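The identity above, and the separability point behind it, can be checked numerically (the example vector is my own choice):

```python
import numpy as np

w = np.array([1.0, -2.0, 3.0])  # arbitrary example vector
d = len(w)

l1 = np.abs(w).sum()
l2_sq = (w ** 2).sum()
cross = sum(abs(w[j]) * abs(w[i])
            for j in range(d) for i in range(d) if i != j)

# ||w||_1^2 = ||w||_2^2 + sum_j sum_{i != j} |w_j| |w_i|
assert np.isclose(l1 ** 2, l2_sq + cross)

# Separability: ||w||_1 and ||w||_2^2 are plain sums of
# per-coordinate terms, so each decomposes coordinatewise ...
assert np.isclose(l1, sum(abs(wi) for wi in w))
assert np.isclose(l2_sq, sum(wi ** 2 for wi in w))
# ... while the non-squared ||w||_2 couples the coordinates:
assert not np.isclose(np.linalg.norm(w), sum(abs(wi) for wi in w))

print("all identities verified")
```

This coordinatewise decomposition is exactly what makes the squared $L_2$ and non-squared $L_1$ penalties easy to optimize (e.g. one coordinate at a time), whereas $||w||_1^2$ and $||w||_2$ couple the coordinates through cross terms or a square root.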