Proximal Operator of the Huber Loss Function

2.4k Views Asked by At

I want to solve the following problem:

$$ \arg\min_x |x|_\mu + \frac{1}{2\sigma} |x-x^k|^2 $$

Where the Huber Loss Function is given by:

$$|x|_\mu = \begin{cases} \frac{|x|^2}{2}, & |x| \leq \mu \\ \mu \left( |x|-\frac \mu 2 \right) & |x| > \mu \end{cases}. $$

Then, the optimal condition is $$ 0 = \partial(|\hat x|_\mu) + \frac 1\sigma (\hat x - x^k) $$

However, at this time, I don't know how to proceed.

1

There are 1 best solutions below

13
On BEST ANSWER

To simplify the exposition, consider the standard case: $\mu = 1$ (see justification below). In any dimensions (your case is 1-dimensional), the Huber function is the infimal convolution of the $\ell_1$-norm $x \mapsto \|x\|_1$ and the half-squared $\ell_2$-norm $x \mapsto \frac{1}{2}\|x\|_2^2$, i.e $h = \|.\|_1 \Box \frac{1}{2}\|.\|_2^2$ (prove this as excercise, or ask for specific details / help in comment section...). Thus, $h^* = \|.\|_1^* + (\frac{1}{2}\|.\|_2^2)^* = i_{\mathbb B_\infty} + \frac{1}{2}\|.\|_2^2$, where $\mathbb B_\infty$ is the unit-ball for the $\ell_\infty$-norm in $\mathbb R^n$. Now, by Moreau's proximal decomposition, one computes $$ \begin{split} \frac{y-\mathrm{prox}_{\sigma h}(y)}{\sigma} = \mathrm{prox}_{\frac{1}{\sigma}h^*}\left(\frac{1}{\sigma}y\right) &= \mathrm{arg}\min_{x \in \mathbb B_\infty}\frac{1}{2}\left\|x-\frac{y}{\sigma}\right\|_2^2 + \frac{1}{\sigma}\frac{1}{2}\|x\|_2^2\\ &= \mathrm{arg}\min_{{x \in \mathbb B_\infty}}\frac{\sigma + 1}{2\sigma}\left\|x - \frac{y}{\sigma + 1}\right\|_2^2 + \text{ const.}\\ & = P_{\mathbb B_\infty}\left(\frac{y}{\sigma + 1}\right) = (v_1, v_2, \ldots, v_n), \end{split} $$ where $v_j = \frac{y_j}{\max(|y_j|, \sigma + 1)}$. Thus the $j$th component of $\mathrm{prox}_{\sigma h}(y)$ is given by $$(\mathrm{prox}_{\sigma h}(y))_j = y_j - \frac{\sigma y_j}{\max(|y_j|, \sigma + 1)}. $$

Watch our for computational errors!


Justification of only considering the case $\mu=1$: Indeed for general $\mu$, if $\phi_\mu(x) := |x|_\mu$, then it is easy to check that $\phi_\mu(x) = \mu^2\phi_1(x/\mu)=\mu^2 h(x/\mu)$, where $h := \mu_1$. Thus, $$ \text{prox}_{\sigma\phi_\mu}(y) = \arg\min_{x}\frac{1}{2}\|x-y\|^2 + \sigma \phi_\mu(x) = \arg\min_{x}\frac{1}{2}\|x-y\|^2 + \mu^2\sigma h(x/\mu) = \mu z, $$ where $z = \arg\min_{z}\frac{1}{2}\|\mu z - y\|^2 + \mu^2\sigma h(z) = \arg\min_{z}\frac{1}{2}\|z - y/\mu\|^2 + \sigma h(z) = \text{prox}_{\sigma h}(y/\mu). $

$\therefore \text{prox}_{\sigma\phi_\mu}(y) = \mu \text{prox}_{\sigma h}(y/\mu)$.