Derivative of log determinant


It is known that $$ \frac{\partial}{\partial X} \operatorname{ln} | \operatorname{det}(X)| = (X^{-1})^{T} = (X^{T})^{-1}. $$ (The Matrix Cookbook, Petersen, K., 2012)

Let $$ A \in R^{p \times p}, ~ \mu = (\mu_{1}, \cdots, \mu_{p})^{T} \in R^{p}. $$ Moreover, for $v : R \rightarrow R$, let $V(\mu) = \operatorname{diag}(v(\mu_{k})) \in R^{p \times p}.$ I want to obtain the following derivative: $$ \frac{\partial}{\partial \mu} \operatorname{ln} |\operatorname{det}[ V(\mu)\{I_{p} - A V(\mu)\}]|. $$

I tried $$ \frac{\partial V(\mu)}{\partial \mu} \frac{\partial}{\partial V(\mu)} \operatorname{ln} |\operatorname{det}[ V(\mu)\{I_{p} - A V(\mu)\}]|, $$ but failed...


Best answer

Given a scalar function $f:{\mathbb R}\to{\mathbb R},\;$ its derivative $f'=\frac{df}{dx}$ and the following definitions $$\eqalign{ v &= f(\mu),\quad &v' = f'(\mu) \qquad &\big({\rm functions\,applied\,elementwise}\big) \\ V &= {\rm Diag}(v),\quad &V' = {\rm Diag}(v') \\ X &= V-VAV \\ \phi &= \log\big|\det(X)\big| \\ }$$ the gradient of the $\phi$-function with respect to $\mu$ is $$\eqalign{ \frac{\partial\phi}{\partial\mu} &= V'\operatorname{diag}\Big(X^{-1}-AVX^{-1}-X^{-1}VA\Big) \\ }$$ where the diag() function returns the main diagonal of its matrix argument as a column vector, while the Diag() function creates a diagonal matrix from a vector argument.
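As a quick sanity check (not part of the original answer), consider the scalar case $p=1$, writing $a$ for the single entry of $A$ and $v = f(\mu)$, so that $V - VAV = v - av^2$. Differentiating directly gives $$\eqalign{ \phi &= \log|v - av^2| = \log|v| + \log|1-av| \\ \frac{d\phi}{d\mu} &= \frac{v'}{v} - \frac{a\,v'}{1-av} = \frac{v'\,(1-2av)}{v-av^2} \\ }$$ while the matrix formula reduces to $$ v'\left(\frac{1}{v-av^2} - \frac{av}{v-av^2} - \frac{va}{v-av^2}\right) = \frac{v'\,(1-2av)}{v-av^2}, $$ so the two expressions agree.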

The derivation of this gradient follows.


The differential of the element-wise vector function is $\; dv = V'\,d\mu$

Define the matrix $\;X = (V-VAV),\;$ write the $\phi$-function in terms of it, use the result from the Matrix Cookbook, and then perform a change of variables from $X\to V\to v\to\mu$. $$\eqalign{ \phi &= \log|\det(X)| \\ d\phi &= X^{-T}:dX \\ &= X^{-T}:\big(dV-dV\,AV-VA\,dV\big) \\ &= \big(X^{-T}-X^{-T}(AV)^T-(VA)^TX^{-T}\big):dV \\ &= \big(X^{-1}-AVX^{-1}-X^{-1}VA\big):dV \\ &= \big(X^{-1}-AVX^{-1}-X^{-1}VA\big):{\rm Diag}(dv) \\ &= {\rm diag}\big(X^{-1}-AVX^{-1}-X^{-1}VA\big):dv \\ &= {\rm diag}\big(X^{-1}-AVX^{-1}-X^{-1}VA\big):V'\,d\mu \\ &= V'\operatorname{diag}\Big(X^{-1}-AVX^{-1}-X^{-1}VA\Big):d\mu }$$ which recovers the gradient shown above. The step from the fourth to the fifth line transposes the left factor, which is allowed because $dV$ is diagonal and therefore $M:dV = M^T:dV$ for any square matrix $M$.
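The closed-form gradient can be checked numerically against central finite differences. The following is a minimal sketch and not part of the original answer. The choice $f = \exp$, the particular $3\times 3$ matrix `A`, the vector `mu`, and the helper names `phi`, `grad_phi`, `grad_fd` are all assumptions made for illustration.

```python
import numpy as np

def phi(mu, A, f):
    """Return log|det(V - V A V)| with V = Diag(f(mu))."""
    V = np.diag(f(mu))
    X = V - V @ A @ V
    _, logabsdet = np.linalg.slogdet(X)  # slogdet returns (sign, log|det|)
    return logabsdet

def grad_phi(mu, A, f, fprime):
    """Closed form: V' diag(X^{-1} - A V X^{-1} - X^{-1} V A)."""
    V = np.diag(f(mu))
    X = V - V @ A @ V
    Xinv = np.linalg.inv(X)
    M = Xinv - A @ V @ Xinv - Xinv @ V @ A
    # Multiplying by the diagonal matrix V' is an elementwise product with f'(mu).
    return fprime(mu) * np.diag(M)

def grad_fd(mu, A, f, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    g = np.zeros_like(mu)
    for k in range(mu.size):
        e = np.zeros_like(mu)
        e[k] = h
        g[k] = (phi(mu + e, A, f) - phi(mu - e, A, f)) / (2.0 * h)
    return g

# Illustrative data (assumed): a non-symmetric A, small enough that I - A V is invertible.
A = np.array([[0.10, -0.20,  0.05],
              [0.00,  0.15, -0.10],
              [0.20,  0.05, -0.05]])
mu = np.array([0.3, -0.5, 0.1])
f, fprime = np.exp, np.exp  # assumed choice: v = exp(mu), so v' = exp(mu)

g_closed = grad_phi(mu, A, f, fprime)
g_numeric = grad_fd(mu, A, f)
max_abs_diff = np.max(np.abs(g_closed - g_numeric))
print("max |closed - finite difference| =", max_abs_diff)
```

Using a non-symmetric `A` matters here: it exercises the transposition step in the derivation, which relies only on $dV$ being diagonal and not on any symmetry of $A$.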


In the steps above, a colon denotes the trace/Frobenius product, i.e. $$A:B = {\rm Tr}(A^TB)$$ The cyclic property of the trace allows the terms in such a product to be rearranged in many different ways, e.g. $$\eqalign{ A:B &= A^T:B^T &= B:A \\ A:BC &= B^TA:C &= AC^T:B \;= I:A^TBC \;= etc \\ }$$ Several steps also made use of the fact that $(V,V')$ are symmetric matrices.
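The rearrangement rules above, and the identity $M:{\rm Diag}(d) = {\rm diag}(M):d$ used in the derivation, can be spot-checked on random matrices. This is a small sketch under assumed names (`frob` is a hypothetical helper implementing $A:B = {\rm Tr}(A^TB)$), not a proof.

```python
import numpy as np

def frob(A, B):
    """Trace/Frobenius product A:B = Tr(A^T B)."""
    return np.trace(A.T @ B)

rng = np.random.default_rng(1)
A, B, C, M = (rng.standard_normal((3, 3)) for _ in range(4))
d = rng.standard_normal(3)
I = np.eye(3)

# A:B = A^T:B^T = B:A
ok_swap = np.allclose([frob(A.T, B.T), frob(B, A)], frob(A, B))

# A:BC = B^T A:C = A C^T:B = I:A^T B C
ok_cyclic = np.allclose(
    [frob(B.T @ A, C), frob(A @ C.T, B), frob(I, A.T @ B @ C)],
    frob(A, B @ C),
)

# M:Diag(d) = diag(M):d, i.e. only the diagonal of M contributes
ok_diag = np.isclose(frob(M, np.diag(d)), np.diag(M) @ d)

print(ok_swap, ok_cyclic, ok_diag)
```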