Derivative of the trace of an antisymmetric matrix

It is easy to show that the derivative of the trace $Tr(A)$ with respect to $A$, where $A$ is an $N \times N$ matrix, is

$\frac{\partial Tr(A)}{\partial A} = I_{N \times N}$

However, if I now assume $A$ to be antisymmetric, $a_{ij} = -a_{ji}$, then the trace is $Tr(A) = 0$. Yet according to the result above, the derivative would still be $I_{N \times N}$ and not $0$.

In principle, the above result should not depend on what properties $A$ has. Yet it seems like it does.

What am I missing here?

There is 1 best solution below


$ \def\l{\lambda} \def\o{{\tt1}} \def\BR#1{\Big[#1\Big]} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\qiq{\quad\implies\quad} \def\qif{\quad\iff\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} \def\gradLR#1#2{\LR{\grad{#1}{#2}}} $The utility of the gradient $(G)$ is that it can estimate (via a $\o^{st}$ order Taylor series) the change in the function $(f)$ which results from a small change $(dA)$ in the matrix argument $(A)$ $$\eqalign{ f\LR{A+dA} &= f\LR{A} \;&+\; \trace{G^TdA} \\ \trace{A+dA} &= \trace{A} \;&+\; \trace{dA} \\ }$$ where the last line sets $f\LR{A}=\trace{A}\,$ and $\,G=I,\:$ i.e. your particular function.

If both $A$ and $dA$ are skew symmetric, then every term in the Taylor series is zero $$\eqalign{ \trace{A} = \trace{dA} = 0 \qiq \trace{A+dA} = 0 \\ }$$

However, suppose $A$ is skew symmetric but you decide to explore a matrix direction $(dA)$ that is not. Amazingly, the Taylor series will still give you the correct estimate for this situation $$\eqalign{ \trace{A+dA} &= \trace{dA} \\ }$$

What if $dA$ is skew symmetric but $A$ isn't? Guess what, the Taylor series remains valid $$\eqalign{ \trace{A+dA} &= \trace{A} \\ }$$

And if neither $A$ nor $dA$ is skew symmetric, the Taylor series continues to work its magic $$\eqalign{ \trace{A+dA} &= \trace{A} \;+\; \trace{dA} \\ \\ }$$
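The four cases above can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `skew` and `first_order_estimate`, the matrix size, and the random seed are illustrative choices, not part of the original answer. It compares the exact value of $\operatorname{Tr}(A+dA)$ with the first-order expression $\operatorname{Tr}(A) + \operatorname{Tr}(G^T dA)$ using $G = I$ in every case.

```python
import numpy as np

def skew(M):
    """Return the skew-symmetric part of M (hypothetical helper)."""
    return (M - M.T) / 2

def first_order_estimate(A, dA):
    """Tr(A) + Tr(G^T dA) with the unconstrained gradient G = I."""
    G = np.eye(A.shape[0])
    return np.trace(A) + np.trace(G.T @ dA)

rng = np.random.default_rng(0)            # arbitrary seed, for reproducibility
N = 4                                     # arbitrary matrix size
generic_A = rng.standard_normal((N, N))
generic_dA = 1e-3 * rng.standard_normal((N, N))

# The four combinations discussed in the answer.
cases = {
    "A skew, dA skew": (skew(generic_A), skew(generic_dA)),
    "A skew, dA generic": (skew(generic_A), generic_dA),
    "A generic, dA skew": (generic_A, skew(generic_dA)),
    "A generic, dA generic": (generic_A, generic_dA),
}

results = {}
for label, (A, dA) in cases.items():
    exact = np.trace(A + dA)
    estimate = first_order_estimate(A, dA)
    results[label] = (exact, estimate)
    print(f"{label:>22}: exact = {exact:+.6e}, estimate = {estimate:+.6e}")
```

Because the trace is linear, the first-order expression agrees with the exact value up to floating-point rounding in all four cases, and the same $G = I$ is used throughout.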


Please note, I'm not poking fun at your question. In fact, I think it's a very important question. I'm just trying to emphasize that (as you suspected) the gradient $(G=I)\,$ doesn't depend on the symmetry properties of $A$.
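One way to see why $G = I$ does not conflict with $\operatorname{Tr}(A) = 0$ on skew-symmetric matrices is to write out the first-order change along a skew-symmetric direction. This is a short worked step under the same first-order expansion used above, not an additional claim from the answer: $$\operatorname{Tr}\left(G^T dA\right) = \operatorname{Tr}\left(I\,dA\right) = \operatorname{Tr}\left(dA\right) = \sum_{i=1}^{N} dA_{ii} = 0 \quad \text{whenever } dA^T = -dA$$ The diagonal entries of a skew-symmetric matrix satisfy $dA_{ii} = -dA_{ii}$, so they vanish. The gradient is still $I$, but it predicts zero change for every step that keeps $A$ skew symmetric, which is exactly what a function that is identically zero on that set requires.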