I'm trying to derive the ADMM updates for the $\ell_1$ penalized Huber loss:
$$ \arg\min_x \phi_h \left(y - Ax\right) + \gamma\lVert x \rVert_1 $$
where
$$ \phi_h \left( u \right) = \begin{cases} \frac{1}{2}u^2, & \text{if } \mid u \mid \leq 1 \\ \mid u \mid - \frac{1}{2}, & \text{otherwise} \end{cases} $$
So far I know I need the prox operators of both $ \phi_h $ and $ \lVert \cdot \rVert_1 $, and that the steps are:
$$ x^{k+1} = \arg \min_x \left(\phi_h\left(y-Ax\right) + \frac{\rho}{2}\lVert x - z^{k} + u^{k} \rVert_2^2 \right) $$
$$ z^{k+1} = S_{\gamma/\rho}\left(x^{k+1} + u^{k} \right) $$
$$ u^{k+1} = u^{k} + x^{k+1} - z^{k+1}$$
where
$$ S_{\lambda}\left( a \right) = \operatorname{sign}\left(a\right) \max \left(\lvert a \rvert - \lambda, 0 \right) $$
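For reference, here is a minimal NumPy sketch of the two-sided (sign-based) soft-thresholding operator used in the $z$ step; the function name is mine:

```python
import numpy as np

def soft_threshold(a, lam):
    """Two-sided soft thresholding: sign(a) * max(|a| - lam, 0)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

# Components inside [-lam, lam] are zeroed; the rest are shrunk toward 0.
print(soft_threshold(np.array([3.0, -0.5, 1.5]), 1.0))  # → [ 2.   0.   0.5]
```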
This follows §6.1 of *Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers*.
I'm having difficulty deriving the $x^{k+1}$ step. Boyd (section 6.1.1, Huber fitting) suggests it will be:
$$ \frac{\rho}{1+\rho}\left(Ax^{k+1} - y + u^k\right) + \frac{1}{1+\rho}S_{1+1/\rho}\left( Ax^{k+1} - y + u^k \right) $$
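One way to convince yourself of this combined form is to check it numerically against the piecewise definition of $\operatorname{prox}_{(1/\rho)\phi_h}\left(v\right)$, which is $v/(1+1/\rho)$ when $\lvert v \rvert \le 1 + 1/\rho$ and $v - \operatorname{sign}(v)/\rho$ otherwise. A minimal NumPy sketch (helper names are mine, not Boyd's):

```python
import numpy as np

def soft_threshold(a, lam):
    """Two-sided soft thresholding: sign(a) * max(|a| - lam, 0)."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def huber_prox_boyd(v, rho):
    """Boyd sec. 6.1.1 form: rho/(1+rho) * v + 1/(1+rho) * S_{1+1/rho}(v)."""
    return rho / (1 + rho) * v + soft_threshold(v, 1 + 1 / rho) / (1 + rho)

def huber_prox_piecewise(v, rho):
    """prox_{(1/rho) phi_h}(v) for delta = 1, written out case by case."""
    lam = 1.0 / rho
    return np.where(np.abs(v) <= 1 + lam, v / (1 + lam), v - lam * np.sign(v))

# The two expressions agree for any rho > 0.
v = np.linspace(-5.0, 5.0, 201)
for rho in (0.1, 1.0, 10.0):
    assert np.allclose(huber_prox_boyd(v, rho), huber_prox_piecewise(v, rho))
```

In the small-residual regime the soft-thresholding term vanishes and the update is pure scaling (the quadratic branch); in the large-residual regime the two terms combine to $v - \operatorname{sign}(v)/\rho$ (the linear branch).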
But the answers to *Proximal Operator of the Huber Function* suggest that the $j^{\text{th}}$ component of the prox operator will be:
$$ v_j = \frac{y_j - a_j x_j}{\max\left(\lvert y_j - a_j x_j \rvert, 2 \right)} $$
Any help finding this would be hugely appreciated.
The Huber Loss is defined as:
$$ L_\delta \left( x \right) = \begin{cases} \frac{1}{2} {x}^{2} & \text{for} \; \left| x \right| \leq \delta \\ \delta (\left| x \right| - \frac{1}{2} \delta) & \text{for} \; \left| x \right| > \delta \end{cases} $$
When the input is a vector, the Huber Loss is applied component-wise and the results are summed.
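A minimal NumPy sketch of this vector form, with $\delta$ as a parameter (the function name is mine):

```python
import numpy as np

def huber_loss(x, delta=1.0):
    """Componentwise Huber loss with parameter delta, summed over the vector."""
    ax = np.abs(x)
    per_component = np.where(ax <= delta,
                             0.5 * x ** 2,              # quadratic branch
                             delta * (ax - 0.5 * delta))  # linear branch
    return per_component.sum()

# 0.5 * 0.5^2 = 0.125 (quadratic) plus 1 * (2 - 0.5) = 1.5 (linear).
print(huber_loss(np.array([0.5, 2.0]), delta=1.0))  # → 1.625
```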
Regarding your question about the difference between the two derivations of the Proximal Operator of the Huber Loss: I implemented both of them for $ {L}_{1} \left( \cdot \right) $, i.e. $ \delta = 1 $ (to match your definition).
In my code I found both to be equivalent and accurate, verified against CVX. The code is available in my StackExchange Mathematics Q2791227 GitHub repository; it is extended to support any value of $ \delta $, as in my solution to *Proximal Operator / Proximal Mapping of the Huber Loss Function*.
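As a self-contained illustration of that equivalence (without the repository code), here is a NumPy sketch assuming the second derivation reduces to $\operatorname{prox}_{\lambda {L}_{1}}\left(v\right) = v \left(1 - \lambda / \max\left(\lvert v \rvert, \lambda + 1\right)\right)$ for $\delta = 1$; with $\lambda = 1$ this is where the $\max\left(\lvert \cdot \rvert, 2\right)$ denominator quoted in the question comes from. Function names are mine:

```python
import numpy as np

def huber_prox_piecewise(v, lam):
    """prox_{lam * L_1}(v) for delta = 1, written out case by case."""
    return np.where(np.abs(v) <= 1 + lam, v / (1 + lam), v - lam * np.sign(v))

def huber_prox_ratio(v, lam):
    """Equivalent closed form: v * (1 - lam / max(|v|, lam + 1))."""
    return v * (1.0 - lam / np.maximum(np.abs(v), lam + 1.0))

# The two derivations agree for any lam > 0:
# |v| <= lam + 1 gives v / (1 + lam); |v| > lam + 1 gives v - lam * sign(v).
rng = np.random.default_rng(1)
v = rng.normal(scale=4.0, size=2000)
for lam in (0.25, 1.0, 3.0):
    assert np.allclose(huber_prox_piecewise(v, lam), huber_prox_ratio(v, lam))
```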
Note that the book *Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers* uses the Huber Loss Function for Robust Regression, while you are using it for Regularized Robust Regression. You will probably need to adapt $ \lambda $ in your steps.