What is the gradient of the plus function?


Let $f:\mathbb{R}^n\to \mathbb{R}$ be defined by $f(x)=\frac{1}{2}\|(Ax-b)_+\|^2$, where $A\in\mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$, and the plus function acts componentwise on any vector $v$: $((v)_+)_i=\max\{0,v_i\}$.

I think the gradient of this function is as follows: \begin{equation} \nabla f(x)=A^T(Ax-b)_+ \end{equation} Is this correct?


There are 2 best solutions below


Note that $f(x) = g(h(x))$, where $h(x) = Ax - b$ and $$ g(u) = \frac12 \| u_+ \|^2 = \frac12\max(u_1,0)^2 + \cdots + \frac12\max(u_m,0)^2. $$ The function $g$ is differentiable: each term $\frac12\max(u_i,0)^2$ is differentiable in $u_i$ with derivative $\max(u_i,0)$ (both one-sided derivatives vanish at $u_i = 0$), so $$ g'(u) = u_+^T. $$ Also, the derivative of $h$ is $h'(x) = A$. By the chain rule, $f$ is differentiable and $$ f'(x) = g'(h(x)) h'(x) = (Ax - b)_+^T A. $$ If we use the convention that $\nabla f(x)$ is a column vector, then $$ \nabla f(x) = f'(x)^T = A^T(Ax - b)_+. $$ This confirms that $f$ is a differentiable function and that the formula you gave in your question is correct.
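As a quick sanity check of the formula derived above, here is a minimal NumPy sketch that compares $A^T(Ax-b)_+$ against a central finite-difference approximation on random data. The function names `f`, `grad_f`, and `numerical_grad`, the problem sizes, the seed, and the step size `eps` are all illustrative assumptions, not part of the question.

```python
import numpy as np

def f(A, b, x):
    # f(x) = 1/2 * ||(Ax - b)_+||^2, with the plus function applied componentwise
    r = np.maximum(A @ x - b, 0.0)
    return 0.5 * float(r @ r)

def grad_f(A, b, x):
    # Closed-form gradient from the chain rule: A^T (Ax - b)_+
    return A.T @ np.maximum(A @ x - b, 0.0)

def numerical_grad(A, b, x, eps=1e-6):
    # Central finite differences, one coordinate at a time (eps is an assumed step size)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(A, b, x + e) - f(A, b, x - e)) / (2 * eps)
    return g

# Random test problem (sizes and seed chosen arbitrarily for illustration)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)

max_abs_diff = float(np.max(np.abs(grad_f(A, b, x) - numerical_grad(A, b, x))))
print(max_abs_diff)  # should be small if the closed form is right
```

Agreement at a random point is only numerical evidence, not a proof; the chain-rule argument is what establishes the result.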


matrixcalculus.org can differentiate this and similar functions. I entered OP's function there as 1/2*norm2(relu(A*x-b))^2 and got

function: $$f = \frac{1}{2}\|\mathrm{relu}(A\cdot x-b)\|_2^{2}$$

gradient: $$\frac{\partial f}{\partial x} = A^\top \cdot (\mathrm{relu}(A\cdot x-b)\odot \mathrm{relu}(\mathrm{sign}(A\cdot x-b))).$$ Here $\mathrm{relu}$ is OP's $x \mapsto x_+$, $\odot$ is elementwise multiplication, $\mathrm{sign}$ is elementwise sign.
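The expression returned by the tool looks different from $A^T(Ax-b)_+$, but the two should coincide: $\mathrm{relu}(\mathrm{sign}(u))$ is $1$ where $u_i > 0$ and $0$ elsewhere, and multiplying $\mathrm{relu}(u)$ by that indicator leaves it unchanged. A minimal NumPy sketch of this check, with the helper `relu`, the sizes, and the seed chosen here purely for illustration:

```python
import numpy as np

def relu(u):
    # Componentwise plus function: max(u_i, 0)
    return np.maximum(u, 0.0)

# Random test data (sizes and seed are arbitrary assumptions)
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
b = rng.standard_normal(6)
x = rng.standard_normal(4)

u = A @ x - b

# Expression as quoted from the tool's output above
grad_tool = A.T @ (relu(u) * relu(np.sign(u)))

# Simplified expression from the question
grad_simple = A.T @ relu(u)

formulas_agree = bool(np.allclose(grad_tool, grad_simple))
print(formulas_agree)
```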