I am reading the book "Convex Optimization". In Example 4.3 the author says that $\|Ax-b\|_2$ is not differentiable at any point where $Ax-b=0$, where $A$ is a matrix of size $n\times m$ and $x,b$ are vectors with $m$ and $n$ rows, respectively. On the other hand, he says that $\|Ax-b\|_2^2$ is differentiable. Can anybody explain how?
My logic for $\|Ax-b\|_2$:
Since a norm is always nonnegative, over a certain range of $x$ the value of $\|Ax-b\|_2$ will first decrease and then increase (making a kind of V shape). At the point where it touches zero, its derivative switches from negative to positive (or vice versa) for any increment in $x$, so the function is non-differentiable there. But this logic does not apply to $\|Ax-b\|_2^2$. Can anybody help me understand this? Thanks in advance.
Take a look at it in $1$ dimension. There, $A$ becomes a real number $a$, $x$ is a single variable, and $b$ is a number. The first expression becomes
$$|ax-b|$$
which is differentiable (and, indeed, affine) around every $x$ for which $ax-b\neq 0$, but it is not differentiable where $ax-b=0$ (unless $a=b=0$, in which case the expression is identically $0$).
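To make the breaking point explicit, here is a short computation of the one-sided derivatives of $f(x)=|ax-b|$ at the kink $x_0=b/a$ (assuming $a\neq 0$; the name $x_0$ is just for this check). Using $f(x_0)=0$ and $f(x_0+h)=|ah|$,
$$\lim_{h\to 0^-}\frac{|ah|}{h}=-|a|\neq |a|=\lim_{h\to 0^+}\frac{|ah|}{h},$$
so the two one-sided derivatives disagree whenever $a\neq 0$, and $f$ is not differentiable at $x_0$.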
The second expression, however, becomes $$|ax-b|^2$$ which is the same as $$(ax-b)^2$$
which is a polynomial function and therefore differentiable everywhere.
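For contrast, the same check on $g(x)=(ax-b)^2$ shows that nothing special happens at the zero:
$$g'(x)=2a(ax-b),$$
which is defined for every $x$ and equals $0$ at $x_0=b/a$: the squared function flattens out smoothly exactly where the absolute value has its kink.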
Now, in many dimensions, something fairly similar happens. If $Ax-b=0$ at some point and you vary just one component of the vector $x$, the first expression behaves like the absolute value of a linear function along that line, with a "breaking" point where it reaches zero, while the second expression is just a polynomial in the components of $x$.
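For completeness, here is the same comparison via gradients in the general case (a standard calculation; the formulas below are not quoted from the book). Writing $\|Ax-b\|_2^2=(Ax-b)^T(Ax-b)$,
$$\nabla \|Ax-b\|_2^2 = 2A^T(Ax-b),$$
which exists for every $x$, whereas for $Ax-b\neq 0$ the chain rule gives
$$\nabla \|Ax-b\|_2 = \frac{A^T(Ax-b)}{\|Ax-b\|_2},$$
and the denominator vanishes exactly at the points where $Ax-b=0$, which is where differentiability fails.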