Derivative of the product of a scalar-valued matrix function and a matrix with respect to a matrix


How do I take the derivative \begin{align*} \dfrac{\partial\left[\operatorname{tr}(\boldsymbol{A}^2)\cdot\boldsymbol{A}^3\right]}{\partial\boldsymbol{A}}=\,? \end{align*} with respect to a matrix $\boldsymbol{A}$, where $\operatorname{tr}(\cdot)$ is a scalar function of the matrix, namely its trace?

I tried the product-rule formula below, but it does not seem to apply here.

\begin{align*} \dfrac{\partial\left[\underset{n\times\,n}{\boldsymbol{A}}(\boldsymbol{X})\cdot\underset{n\times\,n}{\boldsymbol{B}}(\boldsymbol{X})\right]}{\partial\boldsymbol{X}}=\dfrac{\partial\left[\underset{n\times\,n}{\boldsymbol{A}}(\boldsymbol{X})\right]}{\partial\boldsymbol{X}}\left[\boldsymbol{E}_{n}\otimes\underset{n\times\,n}{\boldsymbol{B}}(\boldsymbol{X})\right] +\left[\boldsymbol{E}_{n}\otimes\underset{n\times\,n}{\boldsymbol{A}}(\boldsymbol{X})\right]\dfrac{\partial\left[\underset{n\times\,n}{\boldsymbol{B}}(\boldsymbol{X})\right]}{\partial\boldsymbol{X}} \end{align*}


There are 2 best solutions below


Expand to first order in $H$: $$\operatorname{tr}\left((A+H)^2\right)(A+H)^3 - \operatorname{tr}(A^2)\,A^3 = \operatorname{tr}(AH+HA)\,A^3 + \operatorname{tr}(A^2)\,(A^2H+AHA+HA^2) + O(\|H\|^2).$$ Since $\operatorname{tr}(HA)=\operatorname{tr}(AH)$, the derivative at $A$ is the linear map $$H \mapsto 2\operatorname{tr}(AH)\,A^3 + \operatorname{tr}(A^2)\,(A^2H+AHA+HA^2).$$
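The linear map above can be sanity-checked numerically. The following is a minimal sketch using NumPy, not part of the original answer; the function names `F` and `dF`, the random test matrices, and the step size `eps` are illustrative choices. It compares the closed-form directional derivative with a central finite difference.

```python
import numpy as np

def F(A):
    """F(A) = tr(A^2) * A^3."""
    return np.trace(A @ A) * (A @ A @ A)

def dF(A, H):
    """Closed-form derivative of F at A applied to the direction H."""
    A2 = A @ A
    return (2.0 * np.trace(A @ H) * (A2 @ A)
            + np.trace(A2) * (A2 @ H + A @ H @ A + H @ A2))

# Arbitrary test point and direction (assumed 4x4 for illustration).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
H = rng.standard_normal((4, 4))

# Central finite difference along H.
eps = 1e-6
fd = (F(A + eps * H) - F(A - eps * H)) / (2 * eps)

# Relative discrepancy between the finite difference and the formula.
rel_err = np.linalg.norm(fd - dF(A, H)) / np.linalg.norm(dF(A, H))
print(rel_err)
```

A small relative error (well below `1e-6` for this step size) suggests the formula and the finite difference agree; it is a consistency check, not a proof.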


$ \def\b{\beta}\def\p{\partial} \def\Eij{E_{ij}} \def\LR#1{\left(#1\right)} \def\trace#1{\operatorname{Tr}\LR{#1}} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\d{\;} $Break the problem into a scalar piece
$$\eqalign{ \b &= \trace{A^2} \\ d\b &= 2\trace{A\;dA} \\ }$$
and a matrix piece
$$\eqalign{ B &= A\d A\d A \\ dB &= dA\d A\d A\;+\;A\d dA\d A\;+\;A\d A\d dA \\ }$$
Multiply them to obtain the function of interest
$$\eqalign{ F &= \b B \\ dF &= B\,d\b + \b\,dB \\ &= 2\trace{A\;dA}\,B + \b\LR{dA\d A^2+A\d dA\d A+A^2\,dA} \\ \grad{F}{A_{ij}} &= 2\trace{A\Eij}\,B+\b\LR{\Eij\,A^2+A\Eij\,A+A^2\,\Eij} \\ &= 2A_{ji}B+\b\LR{\Eij\,A^2+A\Eij\,A+A^2\,\Eij} \\ }$$
where $\Eij$ is a matrix whose components are all zero except for the $(i,j)$ component, which is equal to one. Conveniently, it also represents the gradient of a matrix with respect to its $(i,j)$ component
$$\grad{A}{A_{ij}} = \Eij$$
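The component-wise gradient can be checked the same way, one entry $A_{ij}$ at a time. This is a hedged sketch in NumPy rather than anything from the answer itself; `grad_component`, the 3x3 test matrix, and the step size are assumptions made for illustration.

```python
import numpy as np

def F(A):
    """F(A) = Tr(A^2) * A^3."""
    return np.trace(A @ A) * (A @ A @ A)

def grad_component(A, i, j):
    """dF/dA_ij = 2 A_ji B + beta (E_ij A^2 + A E_ij A + A^2 E_ij)."""
    n = A.shape[0]
    E = np.zeros((n, n))
    E[i, j] = 1.0              # single-entry matrix E_ij
    A2 = A @ A
    beta = np.trace(A2)        # scalar piece
    B = A2 @ A                 # matrix piece
    return 2.0 * A[j, i] * B + beta * (E @ A2 + A @ E @ A + A2 @ E)

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))

# Compare each dF/dA_ij against a central finite difference.
eps = 1e-6
max_err = 0.0
for i in range(n):
    for j in range(n):
        E = np.zeros((n, n))
        E[i, j] = 1.0
        fd = (F(A + eps * E) - F(A - eps * E)) / (2 * eps)
        err = np.abs(fd - grad_component(A, i, j)).max()
        max_err = max(max_err, err)
print(max_err)
```

Each call to `grad_component` returns an $n\times n$ matrix, so the full gradient is a fourth-order object with $n^4$ entries; the loop only confirms it slice by slice.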