For the preparation of my lecture, I am trying to derive the proximal operator of the $L_2$-norm, i.e., the solution of: $$ \hat{\mathbf{x}}(\mathbf{y}) = \underset{\mathbf{x}}{\text{arg min}}\quad \underbrace{\frac{1}{2} \Vert \mathbf{x} - \mathbf{y} \Vert^2_2 + \lambda \Vert \mathbf{x} \Vert_2}_{:=f(\mathbf{x})} \,. $$ I would like to arrive at the well-known answer $$ \hat{\mathbf{x}}(\mathbf{y}) = \max \left\lbrace 0,\; 1 - \frac{\lambda}{\Vert \mathbf{y} \Vert_2} \right\rbrace \cdot \mathbf{y} $$ without using Moreau's decomposition.
For the case $\mathbf{x} \neq \mathbf{0}$, the gradient $\nabla f(\mathbf{x})$ can be calculated as: $$ \nabla f(\mathbf{x}) = \mathbf{x} - \mathbf{y} + \lambda \frac{\mathbf{x}}{\Vert\mathbf{x}\Vert_2} . $$ (I will return to this case later.)
For $\mathbf{x} = \mathbf{0}$, I will use subdifferentials, denoted by $\partial$. It holds: \begin{align} \partial f(\mathbf{x}) = \partial \tfrac{1}{2} \Vert \mathbf{x} - \mathbf{y} \Vert^2_2 + \partial \left( \lambda \Vert \mathbf{x} \Vert_2 \right) = \mathbf{x} - \mathbf{y} + \partial \left( \lambda \Vert \mathbf{x} \Vert_2 \right) , \end{align} with, at $\mathbf{x} = \mathbf{0}$, $$ \partial \left( \lambda \Vert \cdot \Vert_2 \right)(\mathbf{0}) = \lbrace \mathbf{g}: \Vert \mathbf{g} \Vert_2 \leq \lambda \rbrace , $$ so that $$ \partial f(\mathbf{0}) = -\mathbf{y} + \partial \left( \lambda \Vert \cdot \Vert_2 \right)(\mathbf{0}) . $$ The optimality condition $\mathbf{0} \in \partial f(\mathbf{0})$ then reads $$ \mathbf{y} \in \partial \left( \lambda \Vert \cdot \Vert_2 \right)(\mathbf{0}) . $$ Since $\mathbf{0} \in \partial \left( \lambda \Vert \cdot \Vert_2 \right)(\mathbf{0})$, we can return $\mathbf{0}$ in this case.
Now back to the case of $\mathbf{x} \neq \mathbf{0}$. We will set the gradient to $\mathbf{0}$ and solve for $\mathbf{x}$:
$$ \mathbf{x} - \mathbf{y} + \lambda \frac{\mathbf{x}}{\Vert \mathbf{x} \Vert_2} \overset{!}{=} \mathbf{0} $$
$$ \mathbf{x} \left( 1 + \frac{\lambda}{\Vert \mathbf{x} \Vert_2} \right) = \mathbf{y} $$ Now apply $\Vert \cdot \Vert_2$ on both sides: $$ \Vert \mathbf{x} \Vert_2 \left( 1 + \frac{\lambda}{\Vert \mathbf{x} \Vert_2} \right) = \Vert \mathbf{y} \Vert_2 $$ and solve for $\Vert \mathbf{x} \Vert_2$: $$ \Vert \mathbf{x} \Vert_2 = \Vert \mathbf{y} \Vert_2 - \lambda . $$ Using this to replace $\Vert \mathbf{x} \Vert_2$ in the first equation of this step gives $$ \mathbf{x} \left( 1 + \frac{\lambda}{\Vert \mathbf{y} \Vert_2 - \lambda} \right) = \mathbf{x} \cdot \frac{\Vert \mathbf{y} \Vert_2}{\Vert \mathbf{y} \Vert_2 - \lambda} = \mathbf{y} , $$ and solving for $\mathbf{x}$: $$ \mathbf{x} = \left( 1 - \frac{\lambda}{\Vert \mathbf{y} \Vert_2} \right) \cdot \mathbf{y} . $$
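As a quick sanity check on this closed form (a NumPy sketch, variable names mine), one can verify numerically that $\mathbf{x} = \left(1 - \frac{\lambda}{\Vert \mathbf{y} \Vert_2}\right)\mathbf{y}$ makes the gradient from above vanish when $\Vert \mathbf{y} \Vert_2 > \lambda$:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 0.5
y = rng.normal(size=5)
y *= 2.0 / np.linalg.norm(y)                 # rescale so that ||y||_2 = 2 > lam

x = (1.0 - lam / np.linalg.norm(y)) * y      # candidate stationary point
grad = x - y + lam * x / np.linalg.norm(x)   # gradient of f at x (valid since x != 0)

print(np.linalg.norm(grad))                  # ~0 up to floating-point error
```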
Now, in order to make sure that $\mathbf{x} \neq \mathbf{0}$, it must hold that $$ \left( 1 - \frac{\lambda}{\Vert \mathbf{y} \Vert_2} \right) \cdot \mathbf{y} \neq \mathbf{0}, $$ i.e., $$ \Vert \mathbf{y} \Vert_2 \neq \lambda . $$
In total, this would result in the solution $$ \hat{\mathbf{x}} (\mathbf{y}) = \left( 1 - \frac{\lambda}{\Vert \mathbf{y} \Vert_2} \right) \cdot \mathbf{y} \quad \text{for } \Vert \mathbf{y} \Vert_2 \neq \lambda $$ and $$ \hat{\mathbf{x}} (\mathbf{y}) = \mathbf{0} \quad \text{for } \Vert \mathbf{y} \Vert_2 = \lambda.$$ Apparently, this differs from the common solution stated at the beginning of my question. Many thanks in advance for any helping input!
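To make the discrepancy concrete, here is a small numerical comparison of the two expressions (a NumPy sketch; the function names are mine). They coincide whenever $\Vert \mathbf{y} \Vert_2 > \lambda$ and differ for $\Vert \mathbf{y} \Vert_2 < \lambda$, where the $\max\lbrace 0, \cdot \rbrace$ clips the factor to zero:

```python
import numpy as np

def f(x, y, lam):
    """Objective of the prox problem: 0.5*||x - y||_2^2 + lam*||x||_2."""
    return 0.5 * np.sum((x - y) ** 2) + lam * np.linalg.norm(x)

def prox_common(y, lam):
    """The popular formula (block soft thresholding)."""
    return max(0.0, 1.0 - lam / np.linalg.norm(y)) * y

def prox_derived(y, lam):
    """The formula from my derivation (no clipping at zero)."""
    return (1.0 - lam / np.linalg.norm(y)) * y

lam = 1.0
y_big = np.array([3.0, 4.0])    # ||y||_2 = 5   > lam
y_small = np.array([0.3, 0.4])  # ||y||_2 = 0.5 < lam

# The two formulas agree for ||y||_2 > lam ...
print(np.allclose(prox_common(y_big, lam), prox_derived(y_big, lam)))   # True
# ... but differ for ||y||_2 < lam, where the clipped version yields a lower objective:
print(f(prox_common(y_small, lam), y_small, lam),
      f(prox_derived(y_small, lam), y_small, lam))                      # 0.125 1.0
```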
You may be interested in the website http://proximity-operator.net that lists many different proximity operators (+ code).
For instance, the proximity operator of $\|\cdot\|_2$ is the first entry on this page: http://proximity-operator.net/multivariatefunctions.html
Moreover, as mentioned there, the proof of the prox formula is given in Examples 2.15 and 2.16 (pp. 8-9) of the paper
To prove it, they use the fact that $\|x\|$ equals the distance $d(x,C)$ between $x$ and a convex set $C$ when $C$ is the singleton $\{0\}$; the prox of $d(\cdot,C)$ is derived in Example 2.15.
I guess it could be possible to simplify this proof for students.
However, Moreau's identity is useful to teach. It is a generalization of the decomposition of a vector into a component along a subspace and a component along its orthogonal complement -- a fact students have generally used and proved in matrix algebra courses. Just pick the convex function in the prox to be the indicator of the subspace; its conjugate is then the indicator of the orthogonal complement, and the rest follows.
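That special case is easy to demonstrate numerically (a NumPy sketch, with my own variable names): for $f = \iota_V$, the indicator of a subspace $V$, $\mathrm{prox}_f$ is the orthogonal projection onto $V$, $\mathrm{prox}_{f^*}$ is the projection onto $V^\perp$, and Moreau's identity $\mathbf{y} = \mathrm{prox}_f(\mathbf{y}) + \mathrm{prox}_{f^*}(\mathbf{y})$ reduces to the familiar orthogonal decomposition:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 2))      # V = range(A), a 2-D subspace of R^5
P = A @ np.linalg.pinv(A)        # orthogonal projector onto V
y = rng.normal(size=5)

prox_f = P @ y            # prox of the indicator of V = projection onto V
prox_fstar = y - P @ y    # prox of its conjugate = projection onto V's orthogonal complement

print(np.allclose(y, prox_f + prox_fstar))  # True: Moreau's identity
print(abs(prox_f @ prox_fstar) < 1e-10)     # True: the two components are orthogonal
```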
Hope this helps.