A simple derivative of a log-determinant with respect to a matrix


My question is pretty simple. I wish to compute $\frac{d\ln |\mathbf\Sigma^{-1}+\mathbf{B}|}{d \mathbf\Sigma}$. Here $\mathbf{B}$ is a diagonal matrix that does not depend on $\mathbf{\Sigma}$. Can anybody help? Thanks!


There are 2 best solutions below

2
On

To solve the problem, we use the following properties:

$$\frac{\partial \mathbf{A}^{-1}}{\partial x} = -\mathbf{A}^{-1}\frac{\partial \mathbf{A}}{\partial x}\mathbf{A}^{-1}$$

$$\frac{\partial \mathbf{A}\mathbf{B}}{\partial x} = \frac{\partial \mathbf{A}}{\partial x}\mathbf{B} + \mathbf{A}\frac{\partial \mathbf{B}}{\partial x}$$

$$\frac{\partial \ln\left|\mathbf{A}\right|}{\partial x} = \mathrm{Tr}\left(\mathbf{A}^{-1}\frac{\partial \mathbf{A}}{\partial x}\right)$$

as well as the matrix inversion identity

$$(\mathbf{A} + \mathbf{B}\mathbf{D}^{-1}\mathbf{C})^{-1} = \mathbf{A}^{-1} - \mathbf{A}^{-1}\mathbf{B}(\mathbf{D} + \mathbf{C}\mathbf{A}^{-1}\mathbf{B})^{-1}\mathbf{C}\mathbf{A}^{-1}$$

Write $\mathbf{I}_{ij}$ for the matrix with a 1 in position $(i,j)$ and zeros elsewhere, so that $\partial\mathbf{\Sigma}/\partial\mathbf{\Sigma}_{ij} = \mathbf{I}_{ij}$, and assume $\mathbf{B}$ is invertible so that the inversion identity applies. We can then derive step by step:

$$\begin{aligned}
\frac{\mathrm{d}\ln\left|\mathbf{\Sigma}^{-1} + \mathbf{B}\right|}{\mathrm{d}\mathbf{\Sigma}_{ij}}
&= -\frac{\mathrm{d}\ln\left|(\mathbf{\Sigma}^{-1} + \mathbf{B})^{-1}\right|}{\mathrm{d}\mathbf{\Sigma}_{ij}}\\
&= -\mathrm{Tr}\left((\mathbf{\Sigma}^{-1} + \mathbf{B})\frac{\partial}{\partial\mathbf{\Sigma}_{ij}}(\mathbf{\Sigma}^{-1} + \mathbf{B})^{-1}\right)\\
&= -\mathrm{Tr}\left((\mathbf{\Sigma}^{-1} + \mathbf{B})\frac{\partial}{\partial\mathbf{\Sigma}_{ij}}\left(\mathbf{\Sigma} - \mathbf{\Sigma}(\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\mathbf{\Sigma}\right)\right)\\
&= -\mathrm{Tr}\left((\mathbf{\Sigma}^{-1} + \mathbf{B})\left(\mathbf{I}_{ij} - \mathbf{I}_{ij}(\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\mathbf{\Sigma} - \mathbf{\Sigma}(\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\mathbf{I}_{ij} + \mathbf{\Sigma}(\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\mathbf{I}_{ij}(\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\mathbf{\Sigma}\right)\right)\\
&= -\left\{\mathbf{\Sigma}^{-1} + \mathbf{B} - 2(\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\mathbf{\Sigma}(\mathbf{\Sigma}^{-1} + \mathbf{B}) + (\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\mathbf{\Sigma}(\mathbf{\Sigma}^{-1} + \mathbf{B})\mathbf{\Sigma}(\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\right\}_{ji}\\
&= -\left\{\left(\mathbf{I} - (\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\mathbf{\Sigma}\right)(\mathbf{\Sigma}^{-1} + \mathbf{B})\left(\mathbf{I} - \mathbf{\Sigma}(\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\right)\right\}_{ji}\\
&= -\left\{\mathbf{\Sigma}^{-1}(\mathbf{\Sigma}^{-1} + \mathbf{B})^{-1}(\mathbf{\Sigma}^{-1} + \mathbf{B})(\mathbf{\Sigma}^{-1} + \mathbf{B})^{-1}\mathbf{\Sigma}^{-1}\right\}_{ji}\\
&= -\left\{(\mathbf{\Sigma} + \mathbf{\Sigma}\mathbf{B}\mathbf{\Sigma})^{-1}\right\}_{ji}
\end{aligned}$$

Combining the two middle terms in the fifth line uses $(\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1}\mathbf{\Sigma}(\mathbf{\Sigma}^{-1} + \mathbf{B}) = (\mathbf{\Sigma}^{-1} + \mathbf{B})\mathbf{\Sigma}(\mathbf{B}^{-1} + \mathbf{\Sigma})^{-1} = \mathbf{B}$, which holds without assuming that $\mathbf{\Sigma}$ and $\mathbf{B}$ commute (symmetric matrices do not commute in general). Since $\mathbf{\Sigma}$ and $\mathbf{B}$ are both symmetric, the final matrix is symmetric as well, so the index order $ji$ makes no difference. Finally we can draw the following conclusion:

$$\frac{\mathrm{d}\ln\left|\mathbf{\Sigma}^{-1} + \mathbf{B}\right|}{\mathrm{d}\mathbf{\Sigma}} = -(\mathbf{\Sigma} + \mathbf{\Sigma}\mathbf{B}\mathbf{\Sigma})^{-1}$$
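A closed form like this is easy to sanity-check numerically. The sketch below is not part of the original answer: it assumes NumPy, picks an arbitrary symmetric positive definite $\mathbf{\Sigma}$ and a positive diagonal $\mathbf{B}$, and compares the formula against central finite differences taken entry by entry. The function names and the step size `eps` are illustrative choices.

```python
import numpy as np


def log_det_objective(sigma, b):
    """Return ln|Sigma^{-1} + B| using slogdet for numerical stability."""
    _, log_abs_det = np.linalg.slogdet(np.linalg.inv(sigma) + b)
    return log_abs_det


def closed_form_gradient(sigma, b):
    """Closed form -(Sigma + Sigma B Sigma)^{-T}; the transpose is a no-op for symmetric inputs."""
    return -np.linalg.inv(sigma + sigma @ b @ sigma).T


def finite_difference_gradient(sigma, b, eps=1e-6):
    """Central differences, perturbing each entry Sigma_ij independently."""
    grad = np.zeros_like(sigma)
    n = sigma.shape[0]
    for i in range(n):
        for j in range(n):
            step = np.zeros_like(sigma)
            step[i, j] = eps
            plus = log_det_objective(sigma + step, b)
            minus = log_det_objective(sigma - step, b)
            grad[i, j] = (plus - minus) / (2.0 * eps)
    return grad


# Assumed test data: a random SPD Sigma and a positive diagonal B.
rng = np.random.default_rng(0)
n = 4
root = rng.standard_normal((n, n))
sigma = root @ root.T + n * np.eye(n)
b = np.diag(rng.uniform(0.5, 2.0, size=n))

max_abs_error = np.max(
    np.abs(finite_difference_gradient(sigma, b) - closed_form_gradient(sigma, b))
)
print("max abs difference:", max_abs_error)
```

The check treats the entries of $\mathbf{\Sigma}$ as independent variables, which is the same convention the derivation uses; a small difference supports the formula but is of course not a proof.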

0
On

Let me use $M$ instead of $\Sigma;\,$ it's easier to type.

First, define a new matrix variable
$$\eqalign{ A &= M^{-1}+B \cr dA &= dM^{-1} = -M^{-1}\,dM\,M^{-1} \cr }$$

Write the function in terms of this new variable, then find its differential and gradient
$$\eqalign{ \phi &= \log\det A \cr d\phi &= d\log\det A \cr &= A^{-T}:dA \cr &= -A^{-T}:M^{-1}\,dM\,M^{-1} \cr &= -M^{-T}A^{-T}M^{-T}:dM \cr &= -(MAM)^{-T}:dM \cr &= -(M + MBM)^{-T}:dM \cr \frac{\partial\phi}{\partial M} &= -(M + MBM)^{-T} \cr }$$

Some of the steps above use a colon to denote the trace/Frobenius product, i.e.
$$A:BC={\rm tr}(A^TBC)$$

The properties of the trace give rise to lots of ways to rearrange the terms in the product. For example, all of the following are equivalent
$$\eqalign{ A:BC &= BC:A \cr &= A^T:(BC)^T \cr &= AC^T:B \cr &= B^TA:C \cr }$$
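The rearrangement rules for the colon product, and the first-order relation $d\phi = G:dM$ with $G = -(M+MBM)^{-T}$, can both be spot-checked numerically. The following is a rough sketch under assumptions that are not in the answer: NumPy, random test matrices, and a hypothetical helper `frob` implementing $X:Y = {\rm tr}(X^TY)$. In the first part `a`, `b`, `c` are generic matrices, not the diagonal $B$ of the question.

```python
import numpy as np


def frob(x, y):
    """Frobenius (colon) product X:Y = tr(X^T Y)."""
    return np.trace(x.T @ y)


rng = np.random.default_rng(1)
n = 3

# Part 1: the four rearrangements of A:BC listed above, on generic matrices.
a, b, c = (rng.standard_normal((n, n)) for _ in range(3))
reference = frob(a, b @ c)
variants = [
    frob(b @ c, a),          # BC:A
    frob(a.T, (b @ c).T),    # A^T:(BC)^T
    frob(a @ c.T, b),        # AC^T:B
    frob(b.T @ a, c),        # B^T A:C
]
identity_gap = max(abs(v - reference) for v in variants)

# Part 2: first-order check of d(phi) = G:dM for phi = log det(M^{-1} + B).
root = rng.standard_normal((n, n))
m = root @ root.T + n * np.eye(n)            # assumed SPD test matrix
b_diag = np.diag(rng.uniform(0.5, 2.0, n))   # assumed positive diagonal B


def phi(mat):
    """log det(mat^{-1} + B) via slogdet."""
    return np.linalg.slogdet(np.linalg.inv(mat) + b_diag)[1]


g = -np.linalg.inv(m + m @ b_diag @ m).T
dm = 1e-6 * rng.standard_normal((n, n))      # small arbitrary perturbation
differential_gap = abs((phi(m + dm) - phi(m)) - frob(g, dm))

print("identity gap:", identity_gap)
print("differential gap:", differential_gap)
```

Both gaps should be tiny: the first is limited only by floating-point rounding, while the second is of second order in the size of `dm`.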