A question about matrix derivatives

71 Views Asked by At

I came across the following equality for the symmetric matrix $H = A + A^T$, where $A$ is an $N \times N$ matrix.

$$\sum_{i,j}\frac{\partial}{\partial A_{ij}}(H^n)_{ij} = n\text{Tr}(H^{n-1}) + \sum_{i=0}^{n-1}\text{Tr}(H^i)\text{Tr}(H^{n-1-i})$$

I'm not quite sure how this is derived. My first thought was to try and use induction since the case when $n = 1$ is trivial, but that was not fruitful. I also tried to use the binomial expansion, but that did not simplify things either. Any explanation would be greatly appreciated.

1

There are 1 best solutions below

2
On BEST ANSWER

There are probably some established rules that can be used but I prefer to derive from first principles for such questions. Let's first think about the one variable case. By general Leibniz rule, $$ \frac{dH^n}{dt} = \sum_{a+b=n-1} H^a\frac{dH}{dt}H^b$$

When $t=a_{ij}$, we have $\frac{\partial H}{\partial a_{ij}}=E_{ij}+E_{ji}$ where $E_{ij}$ is the matrix whose $(i,j)$-entry is $1$ and all other entries are $0$.

Now $$(\frac{\partial H^n}{\partial a_{ij}})_{ij}=\sum_{a+b=n-1} \sum_{p, q}(H^a)_{ip}(E_{ij}+E_{ji})_{pq} (H^b)_{qj}$$

$(E_{ij}+E_{ji})_{pq}$ is only nonzero when $(p,q)=(i,j)$ or $(p,q)=(j,i)$.

So $$(\frac{\partial H^n}{\partial a_{ij}})_{ij}=\sum_{a+b=n-1} (H^a)_{ii}(H^b)_{jj}+\sum_{a+b=n-1} (H^a)_{ij}(H^b)_{ij}$$ And $$\sum_{i,j} \sum_{a+b=n-1} (H^a)_{ii}(H^b)_{jj}=\sum_{a+b=n-1}(\sum_i (H^a)_{ii})(\sum_j (H^b)_{jj})=\sum_{a+b=n-1}\operatorname{Tr}(H^a)\operatorname{Tr}(H^b)$$

By $H$ is symmetric, $(H^b)_{ij}=(H^b)_{ji}$, hence

$$\sum_{i,j}\sum_{a+b=n-1} (H^a)_{ij}(H^b)_{ij}=\sum_{a+b=n-1}\sum_i (\sum_j(H^a)_{ij}(H^b)_{ji})=\sum_{a+b=n-1}\sum_i (H^{a+b})_{ii}=\sum_{a+b=n-1}\operatorname{Tr}(H^{a+b})=n\operatorname{Tr}(H^{n-1})$$