Why is $\operatorname{argmin} \|w\|^2$ equivalent to $\operatorname{argmax} 1/\|w\|$?


I was wondering why the maximization of $1/\|w\|$ is equivalent to minimizing the squared norm of $w$. Shouldn't it be equivalent to just minimizing the norm of $w$?

This is a very basic optimization question but I am having trouble seeing the intuition behind this concept.

Thanks in advance.


There are 2 best solutions below


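Since $x \mapsto 1/x$ is strictly decreasing on $(0,\infty)$, maximizing $1/\|w\|$ is the same as minimizing $\|w\|$ (over $w \neq 0$). And since $x \mapsto x^2$ is strictly increasing on $[0,\infty)$, minimizing $\|w\|$ and minimizing $\|w\|^2$ produce the same minimizers.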

The same would be true for any strictly increasing function $\phi$ (on $[0,\infty)$), for example $\phi(x) = x^3+42$, but the square of the Euclidean norm is particularly amenable to simple calculations.
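A quick numerical sanity check of this claim, as a minimal sketch: the feasible set below is a hypothetical list of random 3-D vectors chosen only for illustration, and `norm` is a helper defined here, not something from the question. Over any such finite set, all four objectives should select the same vector.

```python
import math
import random


def norm(w):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(x * x for x in w))


# Hypothetical feasible set (an assumption for illustration only):
# 1000 random nonzero vectors in R^3.
random.seed(0)
candidates = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(1000)]

idx = range(len(candidates))

# Index of the optimizer under each objective.
i_min_norm = min(idx, key=lambda i: norm(candidates[i]))          # argmin ||w||
i_min_sq = min(idx, key=lambda i: norm(candidates[i]) ** 2)       # argmin ||w||^2
i_min_phi = min(idx, key=lambda i: norm(candidates[i]) ** 3 + 42) # argmin phi(||w||)
i_max_inv = max(idx, key=lambda i: 1.0 / norm(candidates[i]))     # argmax 1/||w||

# All four objectives pick out the same candidate.
print(i_min_norm == i_min_sq == i_min_phi == i_max_inv)
```

The optimal values differ between the objectives, of course; only the optimizer is shared.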


You are right: minimizing $\|w\|$ and minimizing $\|w\|^2$ give the same minimizers. Still, there is a good reason to work with $\operatorname{argmin} \|w\|^2$ rather than $\operatorname{argmin} \|w\|$. Consider the real line, where the norm is the absolute value. The function $x \mapsto \|x\|$ is not differentiable at $0$, but the function $x \mapsto \|x\|^2$ is differentiable everywhere. So it is much easier to do optimization with $\|\cdot\|^2$ than with $\|\cdot\|$, because you can do calculus with the former at every point.
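To make the differentiability point concrete, here is a short worked computation (standard calculus, written out as an illustration rather than taken from the original answer). On the real line,

$$\frac{d}{dx}\,x^2 = 2x \quad \text{for all } x, \qquad \frac{d}{dx}\,|x| = \begin{cases} 1, & x > 0, \\ -1, & x < 0, \end{cases}$$

and at $x = 0$ the one-sided derivatives of $|x|$ are $-1$ and $+1$, so no derivative exists there. For the Euclidean norm on $\mathbb{R}^n$ the same pattern holds:

$$\nabla \|w\|^2 = 2w \quad \text{for all } w, \qquad \nabla \|w\| = \frac{w}{\|w\|} \quad \text{for } w \neq 0 \text{ only}.$$

Setting $\nabla \|w\|^2 = 2w = 0$ identifies the unconstrained minimizer $w = 0$ directly, whereas $\nabla \|w\|$ is undefined exactly at that point.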

If you know some statistics, this question about the square in variance might be interesting: https://stats.stackexchange.com/questions/118/why-square-the-difference-instead-of-taking-the-absolute-value-in-standard-devia