Differentiating trace of matrix product when matrix elements are functions of a vector


According to a well-known formula (Eqs. 100-104 here),

$$\frac{\partial}{\partial B} {\rm tr}(AB)=A^T$$

for square real-valued matrices $A,B$. For simplicity, assume these matrices are symmetric.
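As a sanity check on the formula, here is a minimal NumPy sketch that compares a central-difference estimate of the gradient with $A^T$. The helper name `trace_product_gradient_fd`, the random test matrices, and the step size are illustrative assumptions, and the entries of $B$ are treated as independent variables (no symmetry constraint is imposed).

```python
import numpy as np

def trace_product_gradient_fd(A, B, eps=1e-6):
    """Central-difference estimate of d tr(AB) / dB, one entry of B at a time."""
    grad = np.zeros_like(B)
    for i in range(B.shape[0]):
        for j in range(B.shape[1]):
            E = np.zeros_like(B)
            E[i, j] = eps  # perturb only B[i, j]
            grad[i, j] = (np.trace(A @ (B + E)) - np.trace(A @ (B - E))) / (2 * eps)
    return grad

# Arbitrary (non-symmetric) test matrices; the seed is an illustrative choice.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

grad_fd = trace_product_gradient_fd(A, B)
max_err = np.max(np.abs(grad_fd - A.T))  # should be at round-off level
```

Because ${\rm tr}(AB)$ is linear in $B$, the central difference is exact up to floating-point round-off, so `max_err` should be tiny.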

But...

1) Say $A_{ij}=f(x_i, x_j), B_{ij}=g(x_i,x_j)$ where $x_i, x_j$ are the elements of a real vector $x$. What becomes of the derivative then?

2) Is the situation more tractable if we know that $B=A^{-1}\circ C$ (where "$\circ$" denotes Hadamard multiplication) and that the elements of $C$ are

$$C_{ij}= \frac{x_i}{A^{-1}_{ij}} \frac{\partial A^{-1}_{ij}}{\partial x_i}$$

(EDIT: In this case I guess $B$ would probably not be symmetric since $C$ is not symmetric unless we impose a condition that $\frac{\partial A^{-1}_{ij}}{\partial x_i} = \frac{\partial A^{-1}_{ij}}{\partial x_j}$.)


There is 1 best solution below


(1) Assume that the matrices $(A,B)$ are functions of a vector $x$

$$\eqalign{ A &= A(x) &\implies a = {\rm vec}(A) \cr B &= B(x) &\implies b = {\rm vec}(B) \cr }$$

Then the trace function can be written in terms of the inner product (:) of either the matrices or the corresponding vectors (for the symmetric $A$ of the question, ${\rm tr}(A^TB)={\rm tr}(AB)$)

$$\eqalign{ \tau &= {\rm tr}(A^TB) = A:B = a:b }$$

Assume that we can calculate the Jacobians of the vector variables and that $K$ is non-singular (which requires $x$ and $b$ to have the same number of elements)

$$\eqalign{ db &= K\,dx \cr da &= J\,dx \cr &= J(K^{-1}\,db) \cr }$$

Substitute these into the differential of the trace function, and find its gradient with respect to $b$.

$$\eqalign{ d\tau &= a:db + b:da \cr &= a:db + b:JK^{-1}\,db \cr &= a:db + K^{-T}J^Tb:db \cr \frac{\partial\tau}{\partial b} &= a + K^{-T}J^Tb \cr }$$

Reshape this result back into matrix form

$$\eqalign{ \frac{\partial\tau}{\partial B} &= {\rm Mat}\bigg(\frac{\partial\tau}{\partial b}\bigg) \cr &= {\rm Mat}\Big(a + K^{-T}J^Tb\Big) \cr }$$

Note that if $A$ is constant, then $J=0$ and this solution simplifies to

$$\eqalign{ \frac{\partial\tau}{\partial B} &= {\rm Mat}\Big(a\Big) = A \cr }$$

as expected.
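The result of part (1) can be checked numerically. The sketch below is only an illustration under loud assumptions: the maps `a_of` and `b_of` are hypothetical toy choices (not the $f$ and $g$ of the question), and $x$ is given $n^2$ entries so that the Jacobian $K$ is square and invertible. The check uses the chain rule $\partial\tau/\partial x = K^T\,\partial\tau/\partial b$ and compares it with a finite-difference gradient in $x$.

```python
import numpy as np

n = 2
rng = np.random.default_rng(1)
W = rng.standard_normal((n * n, n * n))  # hypothetical mixing matrix defining a(x)

def a_of(x):
    return W @ np.tanh(x)        # toy a = vec(A(x))

def b_of(x):
    return x + 0.1 * x**3        # toy b = vec(B(x)), strictly increasing per entry

def jac_a(x):
    return W * (1.0 - np.tanh(x) ** 2)   # J = W @ diag(1 - tanh(x)^2)

def jac_b(x):
    return np.diag(1.0 + 0.3 * x**2)     # K is diagonal with positive entries

def tau(x):
    return a_of(x) @ b_of(x)             # tau = a : b

def grad_tau_wrt_b(x):
    a, b = a_of(x), b_of(x)
    J, K = jac_a(x), jac_b(x)
    # a + K^{-T} J^T b, using a linear solve instead of an explicit inverse
    return a + np.linalg.solve(K.T, J.T @ b)

x0 = rng.standard_normal(n * n)
g = grad_tau_wrt_b(x0)
G = g.reshape((n, n), order="F")  # Mat(.) as the inverse of column-stacking vec

# Chain-rule check: d tau / dx should equal K^T (d tau / db).
eps = 1e-6
fd = np.array([(tau(x0 + eps * e) - tau(x0 - eps * e)) / (2 * eps)
               for e in np.eye(n * n)])
chain_err = np.max(np.abs(jac_b(x0).T @ g - fd))
```

When $x$ has fewer entries than $b$ (as in the question, where $x$ has $n$ entries and $b$ has $n^2$), $K$ is not square and this construction does not apply as written.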



(2) For this case, assume that we can invert the given relationship

$$\eqalign{ A^{-1} &= B\oslash C \cr }$$

and therefore, treating $C$ as constant when taking differentials,

$$\eqalign{ dA^{-1} &= dB\oslash C \cr -A^{-1}\,dA\,A^{-1} &= dB\oslash C \cr dA &= -A(dB\oslash C)A \cr }$$

Now find the differential and gradient of the trace

$$\eqalign{ d\tau &= A:dB + B:dA \cr &= A:dB - B:A(dB\oslash C)A \cr &= A:dB - (A^TBA^T)\oslash C:dB \cr \frac{\partial\tau}{\partial B} &= A - (A^TBA^T)\oslash C \cr }$$

where ($\oslash$) denotes elementwise/Hadamard division.
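A quick numerical check of the gradient in part (2), as a sketch only: it assumes $C$ is a fixed matrix with nonzero entries (it is not built from the derivative formula in the question), takes $B$ as the free variable with $A=(B\oslash C)^{-1}$, and compares the closed-form gradient with central differences. The function names and the random test data are illustrative.

```python
import numpy as np

def tau(B, C):
    A = np.linalg.inv(B / C)          # A = (B ⊘ C)^{-1}
    return np.trace(A.T @ B)          # tau = A : B

def grad_formula(B, C):
    A = np.linalg.inv(B / C)
    return A - (A.T @ B @ A.T) / C    # A - (A^T B A^T) ⊘ C

def grad_fd(B, C, eps=1e-6):
    """Central-difference estimate of d tau / dB, one entry of B at a time."""
    grad = np.zeros_like(B)
    for i in range(B.shape[0]):
        for j in range(B.shape[1]):
            E = np.zeros_like(B)
            E[i, j] = eps
            grad[i, j] = (tau(B + E, C) - tau(B - E, C)) / (2 * eps)
    return grad

rng = np.random.default_rng(2)
n = 3
P = 4.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))  # plays the role of A^{-1}
C = rng.uniform(1.0, 2.0, size=(n, n))                   # fixed, nonzero entries
B = P * C                                                # Hadamard product, so B ⊘ C = P

max_err = np.max(np.abs(grad_formula(B, C) - grad_fd(B, C)))
```

If $C$ itself varies with $B$ or $x$, the differential of $B\oslash C$ picks up an extra term and this check no longer matches the formula.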