Derivative of $l_2$ norm w.r.t matrix


I have a matrix $A$ of size $m \times n$, a vector $B$ of size $n \times 1$, and a vector $c$ of size $m \times 1$. I'd like to take the derivative of the following function with respect to $A$:

$f(A) = \lVert A \times B - c\rVert_2^2$

Notice that this is an $l_2$ (vector) norm, not a matrix norm, since $A \times B$ is $m \times 1$. I am using this in an optimization problem where I need to find the optimal $A$.

On BEST ANSWER

Let $f:A\in M_{m,n}(\mathbb{R})\rightarrow f(A)=(AB-c)^T(AB-c)\in \mathbb{R}$; then its derivative at $A$ is the linear map

$Df_A:H\in M_{m,n}(\mathbb{R})\rightarrow 2(AB-c)^THB$.

If you want its gradient:

$Df_A(H)=tr(2B(AB-c)^TH)$ and $\nabla(f)_A=2(AB-c)B^T$.
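As a sanity check, the same gradient can be sketched entry by entry. The indices $i,j,k,l$ are notation introduced here for illustration and do not appear in the original answer:

$$f(A)=\sum_{i=1}^{m}\Big(\sum_{j=1}^{n}A_{ij}B_j-c_i\Big)^2,\qquad \frac{\partial f}{\partial A_{kl}}=2\Big(\sum_{j=1}^{n}A_{kj}B_j-c_k\Big)B_l=2\,(AB-c)_k\,B_l,$$

which is the $(k,l)$ entry of $2(AB-c)B^T$.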

EDIT 1. Some details for @ Gigili. Let $Z$ be an open subset of $\mathbb{R}^n$ and $g:U\in Z\rightarrow g(U)\in\mathbb{R}^m$.

  1. Its derivative at $U$ is the linear map $Dg_U:H\in \mathbb{R}^n\rightarrow Dg_U(H)\in \mathbb{R}^m$; its associated matrix is $Jac(g)(U)$ (the $m\times n$ Jacobian matrix of $g$); in particular, if $g$ is linear, then $Dg_U=g$. Derivative of a product: $D(fg)_U(H)=Df_U(H)g(U)+f(U)Dg_U(H)$. Example: if $g:X\in M_n\rightarrow X^2$, then $Dg_X:H\rightarrow HX+XH$. Derivative of a composition: $D(f\circ g)_U(H)=Df_{g(U)}(Dg_U(H))$.
  2. Let $m=1$; the gradient of $g$ at $U$ is the vector $\nabla(g)_U\in \mathbb{R}^n$ defined by $Dg_U(H)=\langle\nabla(g)_U,H\rangle$; when $Z$ is an open subset of a vector space of matrices, the scalar product is $\langle X,Y\rangle=tr(X^TY)$. Note that $\nabla(g)_U$ is the transpose of the row matrix $Jac(g)(U)$.
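The example $Dg_X(H)=HX+XH$ for $g(X)=X^2$ can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the matrix size, random seed and step `t` are arbitrary choices made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n))   # point at which we differentiate
H = rng.standard_normal((n, n))   # direction of the perturbation

def g(X):
    """The map X -> X^2 (matrix product, not entrywise square)."""
    return X @ X

# Derivative claimed by the product rule: Dg_X(H) = HX + XH.
D = H @ X + X @ H

# Central finite difference of g along the direction H.
t = 1e-6
fd = (g(X + t * H) - g(X - t * H)) / (2 * t)

# Largest entrywise gap between the two; should be at rounding level.
err = np.max(np.abs(fd - D))
print(err)
```

Note that $HX+XH$ is not $2XH$ in general, because $X$ and $H$ need not commute.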

Here, by the product rule, $Df_A(H)=(HB)^T(AB-c)+(AB-c)^THB=2(AB-c)^THB$ (both terms are real numbers, and each is the transpose of the other, so they are equal).

Thus $Df_A(H)=tr(2B(AB-c)^TH)=tr((2(AB-c)B^T)^TH)=\langle 2(AB-c)B^T,H\rangle$ and $\nabla(f)_A=2(AB-c)B^T$.
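The formula $\nabla(f)_A=2(AB-c)B^T$ can be compared against finite differences. This is only a sketch under assumptions: NumPy is assumed, and the sizes, seed and step `eps` are picked here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, 1))
c = rng.standard_normal((m, 1))

def f(A):
    """f(A) = ||AB - c||_2^2."""
    r = A @ B - c
    return float(np.sum(r * r))

# Closed-form gradient from the derivation: an m x n matrix.
grad = 2 * (A @ B - c) @ B.T

# Central finite differences, one entry of A at a time.
eps = 1e-6
fd_grad = np.zeros_like(A)
for i in range(m):
    for j in range(n):
        E = np.zeros_like(A)
        E[i, j] = eps
        fd_grad[i, j] = (f(A + E) - f(A - E)) / (2 * eps)
grad_err = np.max(np.abs(fd_grad - grad))

# The derivative Df_A(H) = 2(AB-c)^T H B should equal <grad, H> = tr(grad^T H).
H = rng.standard_normal((m, n))
directional = float(np.sum(2 * (A @ B - c).T @ H @ B))
inner = float(np.trace(grad.T @ H))

print(grad_err, abs(directional - inner))
```

Both printed numbers should be at rounding level, since $f$ is quadratic in $A$ and the central difference has no truncation error in that case.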

EDIT 2. @ user79950, it seems to me that you want to calculate $\inf_A f(A)$; if so, computing the derivative is unnecessary. Indeed, if $B=0$, then $f(A)=\lVert c\rVert_2^2$ is constant; if $B\neq 0$, then there is always an $A_0$ such that $A_0B=c$, and the infimum is $0$.
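For $B\neq 0$, one explicit choice (picked here for illustration, not stated in the answer) is $A_0=cB^T/(B^TB)$, since then $A_0B=c\,(B^TB)/(B^TB)=c$. A minimal NumPy sketch with arbitrary sizes and seed:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5
B = rng.standard_normal((n, 1))   # nonzero with probability 1
c = rng.standard_normal((m, 1))

# Rank-one candidate minimizer A0 = c B^T / (B^T B), an m x n matrix.
A0 = c @ B.T / float(np.sum(B * B))

# Residual A0 B - c and the objective value f(A0).
residual = A0 @ B - c
f_min = float(np.sum(residual * residual))

# The gradient 2(A0 B - c)B^T also vanishes at A0, up to rounding.
grad_norm = float(np.linalg.norm(2 * residual @ B.T))
print(f_min, grad_norm)
```

This $A_0$ is not unique when $n>1$: adding any matrix $N$ with $NB=0$ gives another minimizer.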