Equivalence of two expressions involving the derivative of the exponential map


While working on a problem involving the derivative of the exponential map, I came across an interesting identity that seems to be true but I can't prove it. Here is the identity:

$$\frac\partial{\partial A}\mathrm{tr}\left(S\exp(A)\right)=\lim_{t\to0}\frac d{dt}\left[\exp(A+tS)\right]$$ where $S$ is a symmetric positive-definite $n\times n$ matrix, $A$ is a symmetric $n\times n$ matrix, and $t\in\mathbb{R}$. Here $\exp$ is the usual matrix exponential and $\mathrm{tr}$ is the usual matrix trace. All matrices are real.

Is there a straightforward argument for why these two expressions are the same? The matrices $A$ and $S$ are not assumed to commute, i.e. $AS\ne SA$ in general. In particular, $S\exp(A)\ne\exp(A)S$.

Using Duhamel's formula, I can write $$\lim_{t\to0}\frac d{dt}\left[\exp(A+tS)\right]=\int_0^1\exp(\tau A)\,S\,\exp((1-\tau)A)\,d\tau$$ but I do not yet see how to simplify the other side of the identity. I can show that $$\frac\partial{\partial A}\mathrm{tr}(\exp(A))=\exp(A)$$ but putting the (non-commuting) $S$ inside the trace makes a direct generalization difficult.
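A quick numerical sanity check of the Duhamel expression (a sketch, not a proof) can be run with NumPy. The helper names `expm_taylor`, `duhamel_integral` and `directional_derivative` are made up for this illustration, the matrix exponential is a plain truncated Taylor series that is only adequate for small matrices of modest norm, and the random test matrices are arbitrary.

```python
import numpy as np

def expm_taylor(M, terms=40):
    """Matrix exponential via a truncated Taylor series (fine for small norms)."""
    n = M.shape[0]
    result = np.eye(n)
    term = np.eye(n)
    for k in range(1, terms):
        term = term @ M / k          # term now holds M^k / k!
        result = result + term
    return result

def duhamel_integral(A, S, nodes=20):
    """Gauss-Legendre approximation of int_0^1 exp(tau A) S exp((1 - tau) A) dtau."""
    x, w = np.polynomial.legendre.leggauss(nodes)
    taus = 0.5 * (x + 1.0)           # map nodes from [-1, 1] to [0, 1]
    weights = 0.5 * w
    return sum(wi * expm_taylor(ti * A) @ S @ expm_taylor((1.0 - ti) * A)
               for ti, wi in zip(taus, weights))

def directional_derivative(A, S, h=1e-5):
    """Central-difference approximation of d/dt exp(A + t S) at t = 0."""
    return (expm_taylor(A + h * S) - expm_taylor(A - h * S)) / (2.0 * h)

rng = np.random.default_rng(0)
n = 3
M = rng.standard_normal((n, n))
A = 0.5 * (M + M.T)                  # symmetric
N = rng.standard_normal((n, n))
S = N @ N.T + n * np.eye(n)          # symmetric positive-definite

lhs = directional_derivative(A, S)
rhs = duhamel_integral(A, S)
max_gap = np.max(np.abs(lhs - rhs))
print("largest entrywise difference:", max_gap)
```

The printed difference should be limited only by the finite-difference step and floating-point round-off.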

Best answer

$ \def\o{{\tt1}}\def\p{\partial} \def\BR#1{\left[#1\right]} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\qiq{\quad\implies\quad} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\Sk{\sum_{k=\o}^\infty} \def\Sj{\sum_{j=\o}^k} \def\Skj{\Sk\Sj} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} \def\k{\frac{\o}{k!}} \def\c#1{\color{red}{#1}} $Define the matrix variables
$$\eqalign{
B &= A+St \qiq \dot B = S \\
F &= \exp(B) \;=\; I + \Sk\k\:B^k \\
}$$
then use differentials to calculate the derivative of $F$
$$\eqalign{
dF &= \Sk\k\:\c{dB^k} \\
&= \Sk\k\c{\Sj\LR{B^{k-j}\:dB\:B^{j-\o}}} \\
\dot F &= {\Skj\k\LR{B^{k-j}SB^{j-\o}}} \\
\lim_{t\to 0}\dot F &= \Skj\k\LR{A^{k-j}SA^{j-\o}} \\
}$$
Now consider the gradient of the trace expression
$$\eqalign{
\phi &= S:e^A \\
&= S:\LR{I+\Sk\k A^k} \\
d\phi &= S:\LR{\Sk\k\:dA^k} \\
&= S:\LR{\Sk\k\Sj A^{k-j}\:dA\:A^{j-\o}} \\
&= \LR{\Skj\k\LR{A^{k-j}SA^{j-\o}}}:dA \\
\grad{\phi}{A} &= \Skj\k\LR{A^{k-j}SA^{j-\o}} \\
}$$
Moving the powers of $A$ across the product in the second-to-last step uses the symmetry $A^T=A$. The gradient agrees term by term with $\lim_{t\to 0}\dot F$, so the two expressions in the question are equal.
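The double sum can also be checked numerically. The following is a minimal sketch, assuming NumPy and treating all $n^2$ entries of $A$ as independent when differencing. The function names (`expm_taylor`, `series_gradient`, `fd_gradient`) and the truncation at `kmax=40` are choices made for this illustration only.

```python
import numpy as np

def expm_taylor(M, terms=40):
    """Matrix exponential via a truncated Taylor series (fine for small norms)."""
    n = M.shape[0]
    result = np.eye(n)
    term = np.eye(n)
    for k in range(1, terms):
        term = term @ M / k          # term now holds M^k / k!
        result = result + term
    return result

def series_gradient(A, S, kmax=40):
    """Truncated double sum: sum_k (1/k!) sum_{j=1..k} A^(k-j) S A^(j-1)."""
    n = A.shape[0]
    powers = [np.eye(n)]
    for _ in range(kmax):
        powers.append(powers[-1] @ A)
    G = np.zeros((n, n))
    coeff = 1.0
    for k in range(1, kmax + 1):
        coeff /= k                   # coeff now holds 1/k!
        inner = sum(powers[k - j] @ S @ powers[j - 1] for j in range(1, k + 1))
        G += coeff * inner
    return G

def fd_gradient(A, S, h=1e-5):
    """Entrywise central differences of phi(A) = tr(S exp(A))."""
    n = A.shape[0]
    G = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            E = np.zeros((n, n))
            E[i, j] = h
            plus = np.trace(S @ expm_taylor(A + E))
            minus = np.trace(S @ expm_taylor(A - E))
            G[i, j] = (plus - minus) / (2.0 * h)
    return G

rng = np.random.default_rng(0)
n = 3
M = rng.standard_normal((n, n))
A = 0.5 * (M + M.T)                  # symmetric
N = rng.standard_normal((n, n))
S = N @ N.T + n * np.eye(n)          # symmetric positive-definite

grad_series = series_gradient(A, S)
grad_fd = fd_gradient(A, S)
h = 1e-5
dF_fd = (expm_taylor(A + h * S) - expm_taylor(A - h * S)) / (2.0 * h)

gap_grad = np.max(np.abs(grad_series - grad_fd))   # series vs. gradient of the trace
gap_dir = np.max(np.abs(grad_series - dF_fd))      # series vs. d/dt exp(A + tS) at t = 0
print(gap_grad, gap_dir)
```

Both gaps should be small compared with the entries of `grad_series`, which is consistent with the series being a common value of both sides of the identity.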


The above derivation uses the Frobenius product, which is a concise notation for the trace $$\eqalign{ A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A^TB} \\ A:A &= \|A\|^2_F \\ }$$
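The properties of the Frobenius product used above, including the rearrangement $S:(X\,dA\,Y) = (X^TSY^T):dA$ that moves factors from one side of the product to the other, can be spot-checked on random matrices. This is only an illustrative sketch assuming NumPy; `frob` is a hypothetical helper name, and the matrices here are generic (not symmetric) to show where the transposes appear.

```python
import numpy as np

def frob(X, Y):
    """Frobenius product X:Y, the sum over i, j of X_ij * Y_ij."""
    return float(np.sum(X * Y))

rng = np.random.default_rng(1)
X, Y, S, dA = (rng.standard_normal((4, 4)) for _ in range(4))

# X:Y equals Tr(X^T Y)
trace_form = float(np.trace(X.T @ Y))

# X:X equals the squared Frobenius norm
norm_sq = float(np.linalg.norm(X, "fro") ** 2)

# Rearrangement rule that follows from the cyclic property of the trace
lhs = frob(S, X @ dA @ Y)
rhs = frob(X.T @ S @ Y.T, dA)

print(abs(frob(X, Y) - trace_form), abs(frob(X, X) - norm_sq), abs(lhs - rhs))
```

When $X$ and $Y$ are powers of a symmetric $A$, the transposes drop out, which is the form used in the derivation.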