I'm trying to derive the ADMM updates for the $\ell_1$ penalized Huber loss:
$$ \arg\min_x \phi_h \left(y - Ax\right) + \gamma\lVert x \rVert_1 $$
where
$$ \phi_h \left( u \right) = \begin{cases} \frac{1}{2}u^2, & \text{if } \mid u \mid \leq 1 \\ \mid u \mid - \frac{1}{2}, & \text{otherwise} \end{cases} $$
So far I know I need the prox operators of both $ \phi_h $ and $ \lVert \cdot \rVert_1 $, and that the steps are:
$$ x^{k+1} = \arg \min_x \left(\phi_h\left(y-Ax\right) + \frac{\rho}{2}\lVert x - z^{k} + u^{k} \rVert_2^2 \right) $$
$$ z^{k+1} = S_{\gamma/\rho}\left(x^{k+1} + u^{k} \right) $$
$$ u^{k+1} = u^{k} + x^{k+1} - z^{k+1}$$
where
$$ S_{\lambda}\left( a \right) = \operatorname{sign}\left(a\right) \max \left(\lvert a \rvert - \lambda, 0 \right) $$
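For reference, here is a minimal NumPy sketch of the two-sided (sign-based) soft-thresholding operator used in the $z$ step; the function name is mine:

```python
import numpy as np

def soft_threshold(a, lam):
    """Two-sided soft thresholding: sign(a) * max(|a| - lam, 0)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

# Components inside [-lam, lam] are zeroed; the rest are shrunk toward 0.
print(soft_threshold(np.array([3.0, -0.5, 1.5]), 1.0))  # → [ 2.   0.   0.5]
```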
This follows §6.1 of *Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers*.
I'm having difficulty deriving the $x^{k+1}$ step. Boyd (section 6.1.1, Huber fitting) suggests it will be:
$$ \frac{\rho}{1+\rho}\left(Ax^{k+1} - y + u^k\right) + \frac{1}{1+\rho}S_{1+1/\rho}\left( Ax^{k+1} - y + u^k \right) $$
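One way to convince yourself of this combined form is to check it numerically against the piecewise definition of $\operatorname{prox}_{(1/\rho)\phi_h}\left(v\right)$, which is $v/(1+1/\rho)$ when $\lvert v \rvert \le 1 + 1/\rho$ and $v - \operatorname{sign}(v)/\rho$ otherwise. A minimal NumPy sketch (helper names are mine, not Boyd's):

```python
import numpy as np

def soft_threshold(a, lam):
    """Two-sided soft thresholding: sign(a) * max(|a| - lam, 0)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def huber_prox_boyd(v, rho):
    """Boyd sec. 6.1.1 form: rho/(1+rho) * v + 1/(1+rho) * S_{1+1/rho}(v)."""
    return rho / (1 + rho) * v + soft_threshold(v, 1 + 1 / rho) / (1 + rho)

def huber_prox_piecewise(v, rho):
    """prox_{(1/rho) phi_h}(v) for delta = 1, written out case by case."""
    lam = 1.0 / rho
    return np.where(np.abs(v) <= 1 + lam, v / (1 + lam), v - lam * np.sign(v))

# The two expressions agree for any rho > 0.
v = np.linspace(-5.0, 5.0, 201)
for rho in (0.1, 1.0, 10.0):
    assert np.allclose(huber_prox_boyd(v, rho), huber_prox_piecewise(v, rho))
```

In the small-residual regime the soft-thresholding term vanishes and the update is pure scaling (the quadratic branch); in the large-residual regime the two terms combine to $v - \operatorname{sign}(v)/\rho$ (the linear branch).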
But the answers to *Proximal Operator of the Huber Function* suggest that the $j^{\text{th}}$ component of the prox operator will be:
$$ v_j = \frac{y_j - a_j x_j}{\max\left(\lvert y_j - a_j x_j \rvert, 2 \right)} $$
Any help finding this would be hugely appreciated.
The Huber Loss is defined as:
$$ L_\delta \left( x \right) = \begin{cases} \frac{1}{2} {x}^{2} & \text{for} \; \left| x \right| \leq \delta \\ \delta (\left| x \right| - \frac{1}{2} \delta) & \text{for} \; \left| x \right| > \delta \end{cases} $$
When the input is a vector, the Huber Loss is applied component-wise and the results are summed.
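A minimal NumPy sketch of this vector form, with $\delta$ as a parameter (the function name is mine):

```python
import numpy as np

def huber_loss(x, delta=1.0):
    """Componentwise Huber loss with parameter delta, summed over the vector."""
    ax = np.abs(x)
    per_component = np.where(ax <= delta,
                             0.5 * x ** 2,              # quadratic branch
                             delta * (ax - 0.5 * delta))  # linear branch
    return per_component.sum()

# 0.5 * 0.5^2 = 0.125 (quadratic) plus 1 * (2 - 0.5) = 1.5 (linear).
print(huber_loss(np.array([0.5, 2.0]), delta=1.0))  # → 1.625
```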
Regarding your question about the difference between the two derivations of the Proximal Operator of the Huber Loss: I implemented both of them for $ {L}_{1} \left( \cdot \right) $, i.e. $ \delta = 1 $ (to match your definition).
In my code I found both to be equivalent and accurate, verified against CVX. The code is available in my StackExchange Mathematics Q2791227 GitHub repository; it is extended to support any value of $ \delta $, as in my solution to *Proximal Operator / Proximal Mapping of the Huber Loss Function*.
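As a self-contained illustration of that equivalence (without the repository code), here is a NumPy sketch assuming the second derivation reduces to $\operatorname{prox}_{\lambda {L}_{1}}\left(v\right) = v \left(1 - \lambda / \max\left(\lvert v \rvert, \lambda + 1\right)\right)$ for $\delta = 1$; with $\lambda = 1$ this is where the $\max\left(\lvert \cdot \rvert, 2\right)$ denominator quoted in the question comes from. Function names are mine:

```python
import numpy as np

def huber_prox_piecewise(v, lam):
    """prox_{lam * L_1}(v) for delta = 1, written out case by case."""
    return np.where(np.abs(v) <= 1 + lam, v / (1 + lam), v - lam * np.sign(v))

def huber_prox_ratio(v, lam):
    """Equivalent closed form: v * (1 - lam / max(|v|, lam + 1))."""
    return v * (1.0 - lam / np.maximum(np.abs(v), lam + 1.0))

# The two derivations agree for any lam > 0:
# |v| <= lam + 1 gives v / (1 + lam); |v| > lam + 1 gives v - lam * sign(v).
rng = np.random.default_rng(1)
v = rng.normal(scale=4.0, size=2000)
for lam in (0.25, 1.0, 3.0):
    assert np.allclose(huber_prox_piecewise(v, lam), huber_prox_ratio(v, lam))
```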
Note that the book *Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers* uses the Huber Loss Function for Robust Regression, while you are using it for Regularized Robust Regression. You will probably need to adapt $ \lambda $ in your steps.