Proof: Derivative of the trace of a function


I've come across the identity \begin{equation} \frac{\partial \text{Tr} \{F[M(x)]\}}{\partial M(x)} = F'[M(x)]^T \end{equation} where $F'$ is the scalar derivative of $F$, but I've never seen a proof of it. Does anybody know how to prove it, or have a reference?


$ \def\a{\alpha}\def\b{\beta} \def\o{{\tt1}}\def\p{\partial} \def\L{\left}\def\R{\right} \def\LR#1{\L(#1\R)}\def\BR#1{\Big(#1\Big)} \def\trace#1{\operatorname{Tr}\LR{#1}} \def\qiq{\quad\implies\quad} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} $Write the Taylor series of a function of a scalar variable $x$ and the series of its derivative (the $k=0$ term of the derivative is zero)
$$\eqalign{
F(x) &= \sum_{k=0}^\infty \a_kx^k \\
F'(x) &= \sum_{k=0}^\infty (k\a_k)x^{k-\o} \\
}$$
Apply the function to a square matrix argument $X$ (assuming the series converges at $X$) and take the trace
$$\eqalign{
\phi &= \trace{F(X)} \;=\; I:\LR{\sum_{k=0}^\infty \a_kX^k} \\
}$$
Then calculate its differential and gradient. The product rule gives the terms in red; the factors must be kept in order because $X$ and $dX$ do not commute in general.
$$\eqalign{
d\phi &= I:\LR{\sum_{k=0}^\infty\a_k\;\c{dX^{k}}} \\
&= I:\LR{\sum_{k=0}^\infty\a_k\;\c{\sum_{j=\o}^k X^{j-\o}dX\;X^{k-j}}} \\
&= \LR{\sum_{k=0}^\infty\a_k\;\sum_{j=\o}^k\LR{X^{j-\o}}^TI\;\LR{X^{k-j}}^T}:dX \\
&= \LR{\sum_{k=0}^\infty\a_k\,\LR{k X^{k-\o}}}^T:dX \\
&= F'(X)^T:dX \\
\grad{\phi}{X} &= F'(X)^T \\
}$$
The fourth line holds because $\LR{X^{j-\o}}^T\LR{X^{k-j}}^T = \LR{X^{k-j}X^{j-\o}}^T = \LR{X^{k-\o}}^T$ for every $j$, so the inner sum consists of $k$ identical terms.
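As a sanity check on the result, here is a minimal numerical sketch that compares $F'(X)^T$ with a central finite-difference gradient of $\operatorname{Tr}F(X)$. It assumes $F$ is a polynomial (a truncated series); the coefficients, the test matrix, the step size `h`, and all helper names are arbitrary illustrative choices, not part of the derivation.

```python
# Finite-difference check of d Tr(F(X)) / dX = F'(X)^T for a polynomial F.
# Coefficients, test matrix and helper names are illustrative assumptions.

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def poly_matrix(coeffs, X):
    """Evaluate sum_k coeffs[k] * X^k for a square matrix X."""
    n = len(X)
    result = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # X^0
    for a in coeffs:
        result = [[result[i][j] + a * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = matmul(power, X)
    return result

def poly_derivative(coeffs):
    """Coefficients of F' given the coefficients of F."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def numerical_gradient(coeffs, X, h=1e-6):
    """Central differences of Tr(F(X)) with respect to each entry X[i][j]."""
    n = len(X)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[i][j] += h
            Xm[i][j] -= h
            G[i][j] = (trace(poly_matrix(coeffs, Xp))
                       - trace(poly_matrix(coeffs, Xm))) / (2 * h)
    return G

coeffs = [1.0, -2.0, 0.5, 3.0]          # F(x) = 1 - 2x + 0.5x^2 + 3x^3
X = [[0.2, -0.5, 0.1],
     [0.7, 0.3, -0.4],
     [-0.1, 0.6, 0.5]]                  # deliberately non-symmetric

analytic = transpose(poly_matrix(poly_derivative(coeffs), X))  # F'(X)^T
numeric = numerical_gradient(coeffs, X)
max_error = max(abs(analytic[i][j] - numeric[i][j])
                for i in range(3) for j in range(3))
print("max abs difference:", max_error)
```

The test matrix is non-symmetric on purpose, since for a symmetric $X$ the transpose in $F'(X)^T$ would make no visible difference.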


In the derivation above, the colon denotes the Frobenius inner product, which is a convenient product notation for the trace, e.g.
$$\eqalign{
A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A^TB} \\
A:A &= \big\|A\big\|^2_F \\
I:B &= \trace{I^TB} \;=\; \trace{B} \\
}$$
The cyclic and transpose-invariance properties of the underlying trace function allow the terms in such a product to be rearranged in many different but equivalent ways, e.g.
$$\eqalign{
A:B &= B:A \\
A:B &= A^T:B^T \\
C:AB &= CB^T:A = A^TC:B \\
}$$
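The rearrangement rules for the colon product can be spot-checked on small matrices. The following is a minimal sketch with integer entries, so the comparisons are exact; the matrices and the helper names (`colon`, `matmul`, `transpose`, `trace`) are made up for illustration.

```python
# Spot-check of the colon-product (Frobenius inner product) identities.
# Matrices and helper names are illustrative assumptions.

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def colon(A, B):
    """A:B = sum_ij A_ij * B_ij for matrices of the same shape."""
    return sum(a * b for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3
B = [[1, 0], [-1, 2], [3, 1]]     # 3 x 2
C = [[2, -1], [0, 3]]             # 2 x 2, same shape as the product AB
P = [[1, -2, 0], [3, 1, 4]]       # 2 x 3, same shape as A

# A:B = Tr(A^T B), symmetry, and invariance under transposing both factors
same_as_trace = colon(A, P) == trace(matmul(transpose(A), P))
symmetric = colon(A, P) == colon(P, A)
transpose_invariant = colon(A, P) == colon(transpose(A), transpose(P))

# C:AB = CB^T:A = A^TC:B
lhs = colon(C, matmul(A, B))
mid = colon(matmul(C, transpose(B)), A)
rhs = colon(matmul(transpose(A), C), B)

print(same_as_trace, symmetric, transpose_invariant, lhs == mid == rhs)
```

The shapes are chosen so that every product in the last identity is defined: $CB^T$ has the shape of $A$, and $A^TC$ has the shape of $B$.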