Derivative of matrix product of matrix and Hadamard product of 2 matrices

283 Views Asked by At

Right now I'm trying to find a derivative that's stumping me:

Let $A, B$ be $m \times n$ matrices and $W$ be a $p\times m$ matrix.

$f = W \bullet (A \circ B)$

$\frac{\partial f}{\partial A} = ?$

(The $\bullet$ represents matrix multiplication and the $\circ$ represents taking Hadamard product.)

From the post Derivative of Hadamard product, I've seen that for $g = A \circ B$, $\frac{\partial g}{\partial A} = B:M$, where $M$ is a 6th-order tensor with $M_{ijklmn}=1$ if $(i=k=m)$ and $(j=l=n)$, $0$ otherwise. However, I'm not sure how to deal with the $W$? Thanks.

1

There are 1 best solutions below

1
On

The gradient is a fourth-order tensor $\big(\Gamma\big)$, which is calculated as follows. $$\eqalign{ F &= W\cdot(B\circ A) \\ &= W\cdot (B:{\mathbb M}:A) \\ &= W\cdot (B:{\mathbb M}):A \\ &= (W\cdot{\mathbb M}:B):A \\ dF &= (W\cdot{\mathbb M}:B):dA \\ \Gamma = \frac{\partial F}{\partial A} &= W\cdot{\mathbb M}:B \\ }$$ Here are those last few lines in index notation. $$\eqalign{ dF_{pq} &= W_{pk}B_{ij}{\mathbb M}_{ijkqmn}\,dA_{mn} \\ \Gamma_{pqmn} = \frac{\partial F_{pq}}{\partial A_{mn}} &= W_{pk}B_{ij}{\mathbb M}_{ijkqmn} \\ &= W_{pk}{\mathbb M}_{kqmnij}B_{ij} \\ }$$ The three index-pairs on ${\mathbb M}$ can be rearranged as needed, e.g. $$\eqalign{ {\mathbb M}_{\,ij\,kq\,mn} &= {\mathbb M}_{\,ij\,kq\,mn} \\ &= {\mathbb M}_{\,kq\,mn\,ij} \\ &= {\mathbb M}_{\,ij\,mn\,kq} \\ &= etc. \\ }$$


The problem can also be approached by vectorizing the matrices. $$\eqalign{ a &= \operatorname{vec}(A),\quad b = \operatorname{vec}(B),\quad f = \operatorname{vec}(F) \\ {\cal B} &= \operatorname{Diag}(b) \\ F &= W\cdot(B\circ A)\cdot I\\ f &= (I\otimes W)\cdot(b\circ a) \\ df &= (I\otimes W)\cdot(b\circ da) \\ &= (I\otimes W)\cdot({\cal B}\cdot da) \\ \frac{\partial f}{\partial a} &= (I\otimes W)\cdot{\cal B} \\ }$$ where $\otimes$ is the Kronecker product and $I$ is the identity matrix.