Correct cases for proximal operator of $L_2$ / Euclidean norm when solving without Moreau's decomposition

In preparation for my lecture, I'm trying to derive the proximal operator of the $L_2$ (Euclidean) norm, i.e., the solution of: $$ \hat{\mathbf{x}}(\mathbf{y}) = \underset{\mathbf{x}}{\text{arg min}}\quad \underbrace{\frac{1}{2} \Vert \mathbf{x} - \mathbf{y} \Vert^2_2 + \lambda \Vert \mathbf{x} \Vert_2}_{:=f(\mathbf{x})} \,. $$ I'm trying to arrive at the well-known answer $$ \hat{\mathbf{x}}(\mathbf{y}) = \max \left\lbrace 0,\, 1 - \frac{\lambda}{\Vert \mathbf{y} \Vert_2} \right\rbrace \cdot \mathbf{y} $$ without using Moreau's decomposition.
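As a numerical sanity check (my own addition, not part of the derivation), the claimed closed form can be verified against the first-order optimality condition $\mathbf{x} - \mathbf{y} + \lambda \mathbf{x} / \Vert \mathbf{x} \Vert_2 = \mathbf{0}$ for the nonzero case; `prox_l2` is just an illustrative helper name:

```python
import numpy as np

def prox_l2(y, lam):
    # Block soft-thresholding: the claimed closed-form prox of lam * ||.||_2
    ny = np.linalg.norm(y)
    return max(0.0, 1.0 - lam / ny) * y if ny > 0 else np.zeros_like(y)

# Check the stationarity condition x - y + lam * x / ||x||_2 = 0
# at a point y with ||y||_2 > lam (so the minimizer is nonzero).
y = np.array([3.0, -4.0])           # ||y||_2 = 5
lam = 1.0
x = prox_l2(y, lam)                 # (1 - 1/5) * y = [2.4, -3.2]
grad = x - y + lam * x / np.linalg.norm(x)
print(np.allclose(grad, 0.0))       # True
```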

For the case $\mathbf{x} \neq \mathbf{0}$, the gradient $\nabla f(\mathbf{x})$ can be calculated as: $$ \nabla f(\mathbf{x}) = \mathbf{x} - \mathbf{y} + \lambda \frac{\mathbf{x}}{\Vert\mathbf{x}\Vert_2} . $$ (I will continue on this later).
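A quick finite-difference check of this gradient expression (an illustrative sketch I added; the test point and parameters are arbitrary):

```python
import numpy as np

y = np.array([3.0, -4.0])
lam = 0.7
x = np.array([1.0, 2.0])            # an arbitrary nonzero test point

# f(x) = 0.5 * ||x - y||_2^2 + lam * ||x||_2
f = lambda v: 0.5 * np.sum((v - y) ** 2) + lam * np.linalg.norm(v)

# Analytic gradient (valid for x != 0) vs. central finite differences.
grad_analytic = x - y + lam * x / np.linalg.norm(x)
eps = 1e-6
grad_fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                    for e in np.eye(2)])
print(np.allclose(grad_analytic, grad_fd, atol=1e-5))  # True
```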

For $\mathbf{x} = \mathbf{0}$, I will use subdifferentials, denoted by $\partial$. It holds: \begin{align} \partial f(\mathbf{x}) = \partial \tfrac{1}{2} \Vert \mathbf{x} - \mathbf{y} \Vert^2_2 + \partial \lambda \Vert \mathbf{x} \Vert_2 = \mathbf{x} - \mathbf{y} + \partial \lambda \Vert \mathbf{x} \Vert_2 , \end{align} where at $\mathbf{x} = \mathbf{0}$ $$ \partial \lambda \Vert \mathbf{x} \Vert_2 = \lbrace \mathbf{g}: \Vert \mathbf{g} \Vert_2 \leq \lambda \rbrace , $$ so that $$ \partial f(\mathbf{0}) = -\mathbf{y} + \lbrace \mathbf{g}: \Vert \mathbf{g} \Vert_2 \leq \lambda \rbrace . $$ The optimality condition $\mathbf{0} \in \partial f(\mathbf{0})$ is therefore equivalent to $$ \mathbf{y} \in \lbrace \mathbf{g}: \Vert \mathbf{g} \Vert_2 \leq \lambda \rbrace , $$ i.e., $\Vert \mathbf{y} \Vert_2 \leq \lambda$, and in this case we can return $\mathbf{0}$.
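As a numerical sanity check of this case (my own addition, illustrative only): when $\Vert \mathbf{y} \Vert_2 \leq \lambda$, the point $\mathbf{x} = \mathbf{0}$ should attain an objective value no larger than any other candidate:

```python
import numpy as np

# When ||y||_2 <= lambda, x = 0 should minimize
# f(x) = 0.5 * ||x - y||_2^2 + lambda * ||x||_2.
rng = np.random.default_rng(0)
y = np.array([0.3, -0.4])           # ||y||_2 = 0.5
lam = 1.0                           # lambda >= ||y||_2

f = lambda x: 0.5 * np.sum((x - y) ** 2) + lam * np.linalg.norm(x)

# Compare f(0) against many random candidates.
candidates = rng.normal(size=(1000, 2))
print(all(f(x) >= f(np.zeros(2)) for x in candidates))  # True
```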

Now back to the case of $\mathbf{x} \neq \mathbf{0}$. We will set the gradient to $\mathbf{0}$ and solve for $\mathbf{x}$:

$$ \mathbf{x} - \mathbf{y} + \lambda \frac{\mathbf{x}}{\Vert \mathbf{x} \Vert_2} \overset{!}{=} \mathbf{0} $$

$$ \mathbf{x} \left( 1 + \frac{\lambda}{\Vert \mathbf{x} \Vert_2} \right) = \mathbf{y}$$ now apply $\Vert \cdot \Vert_2$: $$ \Vert \mathbf{x} \Vert_2 \left( 1 + \frac{\lambda}{\Vert \mathbf{x} \Vert_2} \right) = \Vert \mathbf{y} \Vert_2 $$ and solve for $\Vert \mathbf{x} \Vert_2$: $$ \Vert \mathbf{x} \Vert_2 = \Vert \mathbf{y} \Vert_2 - \lambda . $$ Now use this to replace $\Vert \mathbf{x} \Vert_2$ in the second step of this derivation to obtain: $$ \mathbf{x} \left( 1 + \frac{\lambda}{\Vert \mathbf{y} \Vert_2 - \lambda} \right) = \mathbf{y} . $$ Since $1 + \frac{\lambda}{\Vert \mathbf{y} \Vert_2 - \lambda} = \frac{\Vert \mathbf{y} \Vert_2}{\Vert \mathbf{y} \Vert_2 - \lambda}$, solving for $\mathbf{x}$ yields $$ \mathbf{x} = \frac{\Vert \mathbf{y} \Vert_2 - \lambda}{\Vert \mathbf{y} \Vert_2} \, \mathbf{y} = \left( 1 - \frac{\lambda}{\Vert \mathbf{y} \Vert_2} \right) \cdot \mathbf{y} . $$

Now, in order to make sure that $\mathbf{x} \neq \mathbf{0}$, it has to hold $$ \left( 1 - \frac{\lambda}{\Vert \mathbf{y} \Vert_2} \right) \cdot \mathbf{y} \neq \mathbf{0}, $$ i.e., $\mathbf{y} \neq \mathbf{0}$ and $$ \Vert \mathbf{y} \Vert_2 \neq \lambda . $$

In total, this would result in the solution $$ \hat{\mathbf{x}} (\mathbf{y}) = \left( 1 - \frac{\lambda}{\Vert \mathbf{y} \Vert_2} \right) \cdot \mathbf{y} \quad \text{for } \Vert \mathbf{y} \Vert_2 \neq \lambda $$ and $$ \hat{\mathbf{x}} (\mathbf{y}) = \mathbf{0} \quad \text{for } \Vert \mathbf{y} \Vert_2 = \lambda.$$ Apparently, this differs from the common solution stated at the beginning of my question. Many thanks in advance for any helping input!
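To make the discrepancy concrete, here is a small numerical example (my own addition) where the two formulas disagree, namely for $0 < \Vert \mathbf{y} \Vert_2 < \lambda$:

```python
import numpy as np

# A point with 0 < ||y||_2 < lambda, where my piecewise result and the
# common max-formula give different answers.
y = np.array([0.3, -0.4])           # ||y||_2 = 0.5
lam = 1.0

mine = (1.0 - lam / np.linalg.norm(y)) * y            # my case ||y||_2 != lambda
common = max(0.0, 1.0 - lam / np.linalg.norm(y)) * y  # common solution

print(mine)    # a nonzero vector, since 1 - lam/||y||_2 = -1 here
print(common)  # the zero vector
```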

You may be interested in the website http://proximity-operator.net, which lists many different proximity operators (+ code).

For instance, the one for $\Vert \cdot \Vert_2$ is the first entry on this page: http://proximity-operator.net/multivariatefunctions.html

Moreover, as mentioned there, the proof of the prox formula is given in Examples 2.15 and 2.16 (pp. 8–9) of the paper

Combettes, P. L., & Wajs, V. R. (2005). Signal recovery by proximal forward-backward splitting. Multiscale Modeling & Simulation, 4(4), 1168–1200. https://www.ljll.math.upmc.fr/~plc/mms1.pdf

To prove it, they use the fact that $\|x\|$ equals the distance $d(x,C)$ from $x$ to a convex set $C$ when $C$ is the singleton $\{0\}$; the prox of $d(\cdot, C)$ is derived in Example 2.15.

I guess it could be possible to simplify this proof for students.

However, Moreau's identity is useful to teach. It generalizes the decomposition of a vector into a component along a subspace plus a component along its orthogonal complement, a fact students have typically seen and proved in matrix algebra courses. Just pick the convex function in the prox to be the indicator of the subspace; its conjugate is then the indicator of the orthogonal complement, and the rest follows.
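This special case can be illustrated in a few lines (a sketch; the example subspace $S$, spanned by the columns of `A`, and the projector `P` are my own arbitrary choices):

```python
import numpy as np

# Moreau's identity x = prox_f(x) + prox_{f*}(x), specialized to
# f = indicator of a subspace S: prox_f is the orthogonal projection onto S,
# prox_{f*} is the projection onto the orthogonal complement of S, and the
# identity becomes the familiar decomposition x = P_S x + (I - P_S) x.
rng = np.random.default_rng(1)
A = rng.normal(size=(5, 2))                 # basis of a 2-dim subspace S
P = A @ np.linalg.inv(A.T @ A) @ A.T        # orthogonal projector onto S
x = rng.normal(size=5)

print(np.allclose(x, P @ x + (np.eye(5) - P) @ x))  # True
```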

Hope this helps.