Matrix derivative of trace(AB) and ln(det(A)) with respect to a vector

I am confused about differentiating matrix expressions with respect to a vector, and I would appreciate some help. Thanks in advance!

Both $A\left( \mathbf{\theta }\right) $ and $B\left( \mathbf{\theta }\right) $ are nonsingular square matrices that depend on a vector $\mathbf{\theta }$. I am looking for the following derivative: $\frac{\partial \operatorname{tr}\left( A\left( \mathbf{\theta }\right) ^{-1}B\left( \mathbf{\theta }\right) \right) }{\partial \mathbf{\theta }}.$

As an example, $A\left( \mathbf{\theta }\right) =\mathbf{I}+\mathbf{\theta \theta }^{\prime }$, where $\mathbf{I}$ is the identity matrix, and $B\left( \mathbf{\theta }\right) =\mathbf{C}+\mathbf{a\theta }^{\prime }+\mathbf{\theta b}^{\prime }+\mathbf{\theta \theta }^{\prime }$, where $\mathbf{C}$ is a constant matrix and $\mathbf{a}$ and $\mathbf{b}$ are constant vectors of matching dimension. Is there a chain rule for such derivatives? Thanks.

Best answer

The Matrix Cookbook contains many useful formulas, such as the following
$$\eqalign{
\frac{\partial\log(\det(X))}{\partial X} &= X^{-T} \quad&\implies\quad&d\log(\det(X)) = X^{-T}:dX \\
\frac{\partial{\,\rm Tr}(X)}{\partial X} &= I \quad&\implies\quad&d{\,\rm Tr}(X) = I:dX \\
}$$
Substituting the given variables
$$\eqalign{
A &= I+\theta\theta^T \quad&\implies\quad &dA = (d\theta\,\theta^T+\theta\,d\theta^T) \\
I &= A^{-1}A \quad&\implies\quad &0 = dA^{-1}A + A^{-1}\,dA \\
&&&dA^{-1} = -A^{-1}dA\,A^{-1} \\
\\
(B-C) &= \theta\theta^T + a\theta^T + \theta b^T \quad&\implies\quad&dB = dA + a\,d\theta^T + d\theta\,b^T \\
\\
X &\doteq A^{-1}B \quad&\implies\quad&dX = A^{-1}dB + dA^{-1}B \\
}$$
yields
$$\eqalign{
d{\,\rm Tr}(X) &= I:dX \\
&= I:(A^{-1}\,dB - A^{-1}dA\,A^{-1}B) \\
&= A^{-T}:(dA+a\,d\theta^T + d\theta\,b^T) - A^{-T}B^TA^{-T}:dA \\
&= A^{-T}:(a\,d\theta^T + d\theta\,b^T) +(A^{-T}-A^{-T}B^TA^{-T}):(d\theta\,\theta^T+\theta\,d\theta^T) \\
&= \big(A^{-T}b + A^{-1}a\big):d\theta + \big(A^{-T}+A^{-1}-A^{-T}B^TA^{-T}-A^{-1}BA^{-1}\big)\theta:d\theta \\
\\
\frac{\partial{\rm Tr}(X)}{\partial\theta} &= A^{-T}b + A^{-1}a + \Big(A^{-T}+A^{-1}-A^{-T}B^TA^{-T}-A^{-1}BA^{-1}\Big)\theta \\
}$$
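A result like this can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated test data; it compares the closed-form gradient of ${\rm Tr}(A^{-1}B)$ against a central finite-difference approximation for the example $A$ and $B$ from the question. The function names, the dimension `n = 4`, and the step size `h` are illustrative choices, not part of the original derivation.

```python
import numpy as np


def build_matrices(theta, C, a, b):
    """Return A = I + theta theta^T and B = C + a theta^T + theta b^T + theta theta^T."""
    outer = np.outer(theta, theta)
    A = np.eye(theta.size) + outer
    B = C + np.outer(a, theta) + np.outer(theta, b) + outer
    return A, B


def trace_value(theta, C, a, b):
    """Evaluate Tr(A^{-1} B) without forming the inverse explicitly."""
    A, B = build_matrices(theta, C, a, b)
    return np.trace(np.linalg.solve(A, B))


def trace_gradient(theta, C, a, b):
    """Closed-form gradient of Tr(A^{-1} B) with respect to theta."""
    A, B = build_matrices(theta, C, a, b)
    Ainv = np.linalg.inv(A)
    AinvT = Ainv.T
    M = AinvT + Ainv - AinvT @ B.T @ AinvT - Ainv @ B @ Ainv
    return AinvT @ b + Ainv @ a + M @ theta


def finite_difference_gradient(f, theta, h=1e-6):
    """Central-difference approximation of the gradient of a scalar function f."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = h
        grad[i] = (f(theta + step) - f(theta - step)) / (2 * h)
    return grad


# Random test data (assumed for illustration only).
rng = np.random.default_rng(0)
n = 4
theta = rng.standard_normal(n)
C = rng.standard_normal((n, n))
a = rng.standard_normal(n)
b = rng.standard_normal(n)

analytic = trace_gradient(theta, C, a, b)
numeric = finite_difference_gradient(lambda t: trace_value(t, C, a, b), theta)
print("max abs difference:", np.max(np.abs(analytic - numeric)))
```

If the two gradients agree to within the finite-difference error, the printed difference should be small relative to the gradient entries. A check of this kind only supports the formula at the sampled point; it is not a proof.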


In some of the steps above, a colon is used to denote the trace/Frobenius product, i.e.
$$\eqalign{ A:B = {\rm Tr}(A^TB) = {\rm Tr}(B^TA) = B:A }$$
The properties of the trace under transposition and cyclic permutation of its argument allow the terms in such a product to be rearranged in several equivalent ways, e.g.
$$\eqalign{
A:BC &= AC^T:B = B^TA:C = \ldots \\
A:B &= A^T:B^T = I:A^TB = \ldots \\
}$$
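As a further illustration of these rearrangement rules, the log-determinant formula from the Matrix Cookbook can be applied to the example $A = I+\theta\theta^T$ from the question. This is a sketch added for illustration; the final closed form relies on two standard facts that are assumptions brought in here rather than taken from the derivation above: $A$ is symmetric, so $A^{-T}=A^{-1}$, and $\det(I+\theta\theta^T) = 1+\theta^T\theta$.
$$\eqalign{
d\log(\det(A)) &= A^{-T}:dA \\
&= A^{-T}:(d\theta\,\theta^T+\theta\,d\theta^T) \\
&= \big(A^{-T}+A^{-1}\big)\theta:d\theta \\
\\
\frac{\partial\log(\det(A))}{\partial\theta} &= \big(A^{-T}+A^{-1}\big)\theta = 2A^{-1}\theta = \frac{2\,\theta}{1+\theta^T\theta} \\
}$$
The last expression agrees with differentiating $\log(1+\theta^T\theta)$ directly, which gives a quick consistency check on the colon manipulations.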