From a robust statistics perspective, are there any advantages of the Huber loss over the L1 loss (apart from differentiability at the origin)? Specifically, if I don't care about gradients (e.g. when using tree-based methods), does the Huber loss offer any other advantages with respect to robustness?
Moreover, are there any guidelines for choosing the value of the change point between the linear and quadratic pieces of the Huber loss? Thanks.
The Huber loss is less sensitive to small errors than the $\ell_1$ loss, while remaining linear in the error for large errors. To see why, note that $|\cdot|$ has a kink at the origin and penalizes small residuals at the same unit rate as large ones, whereas the Huber loss is quadratic near the origin and therefore penalizes small residuals much more gently. As a result, the Huber loss is often preferred to $\ell_1$ when the data contain both large outliers and small (ideally Gaussian) perturbations.
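To see this numerically, here is a minimal sketch (assuming the standard parameterization with cutoff $\delta = 1$): for small residuals the Huber loss grows quadratically, i.e. much more slowly than $|r|$, while for large residuals both grow at the same linear rate.

```python
import numpy as np

def l1_loss(r):
    """Absolute (L1) loss: linear everywhere, with a kink at the origin."""
    return np.abs(r)

def huber_loss(r, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    quadratic = 0.5 * r**2
    linear = delta * (np.abs(r) - 0.5 * delta)
    return np.where(np.abs(r) <= delta, quadratic, linear)

residuals = np.array([0.1, 0.5, 1.0, 5.0, 50.0])
print("residual      L1   Huber(delta=1)")
for r, l1, hb in zip(residuals, l1_loss(residuals), huber_loss(residuals)):
    print(f"{r:8.1f} {l1:7.2f} {hb:12.3f}")
```

At $r = 0.1$ the Huber loss is $0.005$ versus $0.1$ for $\ell_1$ (far less sensitive), while at $r = 50$ the two are nearly identical ($49.5$ vs. $50$), so both downweight outliers at the same linear rate.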
Where to put the transition point between the quadratic and linear pieces depends on how often outliers or large shocks occur in your data (e.g. "outliers constitute 1% of the data"). In practice it is common to set this cutoff using a robust estimate of the standard deviation of the residuals, so that the cutoff itself is not inflated by the very outliers it is meant to guard against.
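One common recipe (a sketch, not the only option): estimate the residual scale robustly with the normalized MAD and set the cutoff at $k$ times that scale. The textbook choice $k = 1.345$ gives roughly 95% asymptotic efficiency relative to least squares under Gaussian errors while still downweighting outliers.

```python
import numpy as np

def huber_delta(residuals, k=1.345):
    """Choose the Huber cutoff from a robust scale estimate.

    The MAD (median absolute deviation) stands in for the standard
    deviation; the factor 1.4826 makes it consistent for sigma under
    Gaussian data. k = 1.345 is the classical choice giving ~95%
    efficiency at the normal distribution.
    """
    mad = np.median(np.abs(residuals - np.median(residuals)))
    sigma_hat = 1.4826 * mad
    return k * sigma_hat

# Example: mostly Gaussian residuals with 1% gross outliers.
rng = np.random.default_rng(0)
r = np.concatenate([rng.normal(0, 1, 990), rng.normal(0, 20, 10)])
print(huber_delta(r))  # cutoff near 1.3-1.4; the outliers land in the linear region
```

Because the median and MAD are themselves resistant to outliers, the resulting cutoff reflects the scale of the "clean" bulk of the data rather than the contamination.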
Huber's monograph, Robust Statistics, discusses the theoretical properties of his estimator. For more practical matters (implementation and rules of thumb), check out Faraway's very accessible text, Linear Models with R.