Derivative of an exponential that involves the trace operator


I know this is a stupid question, but I have a nagging doubt. I need to differentiate this expression with respect to each element of the matrix $\Sigma$:

$$ \varphi(t) = \exp\big(Tr[A(t)\Sigma] + C(t)\big) $$

Now if I use the property $\mathrm{det}[\exp(A)] = \exp(Tr[A])$ and the fact that the trace is a linear operator, then in theory the result for each element $ij$ should be

$$ \frac{\partial}{\partial \Sigma_{ij}} \varphi = \exp\big(Tr[A(t)\Sigma] + C(t)\big) \times a_{ij} = \varphi \times a_{ij} $$

with $a_{ij}$ an element of the matrix $A$. Is this correct?

Best answer

$ \def\R#1{{\mathbb R}^{#1}} \def\g{\gamma} \def\o{{\tt1}} \def\LR#1{\left(#1\right)} \def\LMR#1#2{\left[ #1 \middle| #2 \right]} \def\op#1{\operatorname{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\qiq{\quad\implies\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} $Let's use a convention wherein an uppercase letter represents a matrix, lowercase a vector, and a Greek letter will denote a scalar.

Next, I'll assume that $C(t)$ is a scalar (since you're adding it to a trace, which is a scalar-valued function), so let's assign it a Greek name $\:C\to\g.\;$ And let's rename the matrix $\:\Sigma\to S.$

The matrix inner product (denoted by a colon) is extremely useful and has the following properties
$$\eqalign{
A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A^TB} \\
I:A &= \trace{A} \\
A:A &= \frob{A}^2 \qquad \{ {\rm Frobenius\;norm} \}\\
A:B &= B:A \;=\; B^T:A^T \\
\LR{AB}:C &= A:\LR{CB^T} \;=\; B:\LR{A^TC}
}$$

Finally, for typing convenience, let $n$ be the dimension of the square matrices $A$ and $S$, and define the matrix variable
$$\eqalign{
X &= \LR{AS + \frac{\g I}{n}} \;\in\; \R{n\times n} \\
\trace X &= \trace{AS} + \g
}$$

Write $\psi$ for your function $\varphi$. Since $A$ and $\g$ do not depend on $S$, the differential of $X$ is $dX = A\,dS$. Use the above notation to write the logarithm of the function and calculate its gradient with respect to $S$
$$\eqalign{
\log(\psi) &= I:X \\
\frac{d\psi}{\psi} &= I:dX = I:\LR{A\,dS} = A^T:dS \\
d\psi &= \psi A^T:dS \\
\grad{\psi}{S} &= \psi A^T \qiq \grad{\psi}{S_{ij}} = \psi A_{ji}
}$$

So the factor multiplying $\psi$ is $A_{ji}$ rather than $a_{ij}$; the two agree only when $A$ is symmetric. This treats the entries of $S$ as independent variables.
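The gradient formula $\psi A^T$ can be checked numerically. The following is a minimal sketch rather than part of the original derivation: it uses plain Python lists for the matrices, and the helper names (`psi`, `analytic_grad`, `numeric_grad`, `max_abs_diff`), the random test matrices, and the step size `h` are all assumptions chosen for illustration. It compares a central finite difference in each entry $S_{ij}$ against both $\psi A_{ji}$ and the untransposed candidate $\psi A_{ij}$.

```python
import math
import random


def psi(A, S, gamma):
    """Return exp(Tr(A S) + gamma) for square matrices given as lists of rows."""
    n = len(A)
    # Tr(A S) = sum_i sum_k A[i][k] * S[k][i]
    trace_AS = sum(A[i][k] * S[k][i] for i in range(n) for k in range(n))
    return math.exp(trace_AS + gamma)


def analytic_grad(A, S, gamma):
    """Gradient from the derivation: entry (i, j) equals psi * A[j][i]."""
    n = len(A)
    value = psi(A, S, gamma)
    return [[value * A[j][i] for j in range(n)] for i in range(n)]


def numeric_grad(A, S, gamma, h=1e-6):
    """Central finite difference in each entry S[i][j], entries treated as independent."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            S_plus = [row[:] for row in S]
            S_minus = [row[:] for row in S]
            S_plus[i][j] += h
            S_minus[i][j] -= h
            G[i][j] = (psi(A, S_plus, gamma) - psi(A, S_minus, gamma)) / (2 * h)
    return G


def max_abs_diff(P, Q):
    """Largest entrywise absolute difference between two equally sized matrices."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


# Arbitrary (non-symmetric) test data; these values are illustrative assumptions.
random.seed(0)
n = 3
A = [[random.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
S = [[random.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
gamma = 0.1

numeric = numeric_grad(A, S, gamma)
value = psi(A, S, gamma)
untransposed = [[value * A[i][j] for j in range(n)] for i in range(n)]

err_transposed = max_abs_diff(analytic_grad(A, S, gamma), numeric)
err_untransposed = max_abs_diff(untransposed, numeric)

print("max error using psi * A[j][i]:", err_transposed)
print("max error using psi * A[i][j]:", err_untransposed)
```

For a non-symmetric $A$, the first error should sit at the level of finite-difference noise while the second should not, which is consistent with the transpose in the result.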