Proximal Operator / Proximal Mapping Scaling Property


According to *Algorithms for Large Scale Convex Optimization (DTU 2010), Proximal Gradient Method*, for $h(x) = f(\lambda x)$ it holds that

$$ \operatorname{prox}_h(x) = \frac{1}{\lambda} \operatorname{prox}_{\lambda^2 f}(\lambda x) $$

where the proximal operator is defined as $\operatorname{prox}_h(x) = \arg\min_u h(u) + \frac{1}{2} \|u - x\|^2$.

I have the following:

$$ \operatorname{prox}_h(x) = \arg\min_u h(u) + \frac{1}{2} \|u - x\|^2 = \arg\min_u f(\lambda u) + \frac{1}{2} \|u - x\|^2 $$

and

$$ \frac{1}{\lambda} \operatorname{prox}_{\lambda^2 f}(\lambda x) = \frac{1}{\lambda} \arg\min_u \lambda^2 f(u) + \frac{1}{2} \|u - \lambda x\|^2 $$

But I don't see why

$$ \arg\min_u f(\lambda u) + \frac{1}{2} \|u - x\|^2 = \frac{1}{\lambda} \arg\min_u \lambda^2 f(u) + \frac{1}{2} \|u - \lambda x\|^2. $$
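As a sanity check (not from the referenced notes), the identity can be verified directly for $f(u) = \tfrac12 u^2$, whose proximal operator has a closed form:

$$ \operatorname{prox}_h(x) = \arg\min_u \tfrac{\lambda^2}{2} u^2 + \tfrac12 \|u - x\|^2 = \frac{x}{1 + \lambda^2}, \qquad \frac{1}{\lambda} \operatorname{prox}_{\lambda^2 f}(\lambda x) = \frac{1}{\lambda} \cdot \frac{\lambda x}{1 + \lambda^2} = \frac{x}{1 + \lambda^2}, $$

so both sides agree in this case, but I don't see the general argument.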

Best Answer

Let's start with $\operatorname{prox}_h(x) = \arg\min_u f(\lambda u) + \frac12 \| u - x \|^2$.

Let's rewrite this optimization problem in terms of $w = \lambda u$. Once we find $w^\star$, an optimal choice of $w$, the minimizer we want is $u^\star = w^\star / \lambda$. Substituting $u = w / \lambda$, \begin{align*} u^\star &= \frac{1}{\lambda} \arg\min_w \quad f(w) + \frac12 \left\| \frac{w}{\lambda} - x \right\|^2 \\ &= \frac{1}{\lambda} \arg\min_w \quad f(w) + \frac{1}{2\lambda^2} \| w - \lambda x \|^2 \\ &= \frac{1}{\lambda} \arg\min_w \quad \lambda^2 f(w) + \frac12 \| w - \lambda x \|^2 \\ &= \frac{1}{\lambda} \operatorname{prox}_{\lambda^2 f}(\lambda x), \end{align*} where the third line uses the fact that scaling the objective by the positive constant $\lambda^2$ does not change the minimizer.