What's the third derivative of $\log( \det X)$ with respect to $X$?


If $X$ is a positive definite matrix, what is the third derivative of $\log(\det X)$ with respect to $X$?

We know that the first derivative of $\log(\det X)$ is $X^{-1}$ and that the second derivative (the Hessian) is $-X^{-1} \otimes X^{-1}$, but I don't know the third derivative of $\log(\det X)$ with respect to $X$.

Thanks in advance.


There are 2 best solutions below

1
On

Let $H$ be a symmetric matrix with "small" entries. Then by writing $\Delta = X^{-1}H$ for simplicity, we may decompose $\log\det(X+H)$ as

$$ \log\det(X+H) = \log\det(X) + \log\det(I+\Delta). $$

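Now, using the identities $A=\exp(\log A)$ and $\det\exp(A) = \exp\operatorname{tr}(A)$ (the second holds for every square matrix; the first holds for $A = I+\Delta$ once $H$ is small enough that the spectral radius of $\Delta$ is less than $1$, so that the logarithm series converges), we get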

\begin{align*} \log\det(I+\Delta) &= \log\det \exp(\log(I+\Delta)) \\ &= \log\det \exp\left(\sum_{k=1}^{\infty}\frac{(-1)^{k-1}}{k} \Delta^k \right) \\ &= \operatorname{tr}\left(\sum_{k=1}^{\infty}\frac{(-1)^{k-1}}{k} \Delta^k \right) \\ &= \sum_{k=1}^{\infty}\frac{(-1)^{k-1}}{k} \operatorname{tr}(\Delta^k). \end{align*}
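The expansion above can be sanity-checked numerically. The following is only a sketch, assuming NumPy is available; the matrix size, the random seed, the scaling of $H$ and the cutoff of 30 terms are arbitrary choices for illustration and are not part of the argument.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Random symmetric positive definite X and a small symmetric perturbation H.
A = rng.standard_normal((n, n))
X = A @ A.T + n * np.eye(n)
B = rng.standard_normal((n, n))
H = 0.05 * (B + B.T)

Delta = np.linalg.solve(X, H)  # Delta = X^{-1} H

# The series is only valid when the spectral radius of Delta is below 1.
assert np.max(np.abs(np.linalg.eigvals(Delta))) < 1

# Left-hand side: log det(X + H) - log det(X), computed via slogdet.
lhs = np.linalg.slogdet(X + H)[1] - np.linalg.slogdet(X)[1]

# Right-hand side: partial sum of sum_k (-1)^(k-1)/k * tr(Delta^k).
rhs = 0.0
P = np.eye(n)
for k in range(1, 31):
    P = P @ Delta  # P now holds Delta^k
    rhs += (-1) ** (k - 1) / k * np.trace(P)

series_error = abs(lhs - rhs)
print(series_error)  # should be close to machine precision
```

If $H$ is scaled up until the spectral radius of $\Delta$ reaches $1$, the partial sums stop converging, which is why the smallness assumption on $H$ matters.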

  • For example, the first derivative of $\log\det(X)$ is the linear functional

    $$ H \mapsto \operatorname{tr}(X^{-1}H), $$

    which can be identified as $X^{-1}$ via the Riesz representation theorem, assuming that the space of symmetric matrices is furnished with the inner product $\langle A, B \rangle = \operatorname{tr}(A^{\mathsf{T}}B)$.

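  • Then the second derivative of $\log\det(X)$ is $2!$ times the second-order term $-\tfrac{1}{2}\operatorname{tr}(\Delta^2)$ of the series, that is, the quadratic form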

    $$ H \mapsto -\operatorname{tr}(X^{-1}HX^{-1}H), $$

    which then induces the bilinear form

    $$ (U, V) \mapsto -\operatorname{tr}(X^{-1}UX^{-1}V) $$

    via polarization. (I am not sure how this can be identified with $-X^{-1}\otimes X^{-1}$, though.)

  • Finally, the third derivative of $\log\det(X)$ is $3!$ times the third-order term $\tfrac{1}{3}\operatorname{tr}(\Delta^3)$ of the series, that is, the cubic form

    $$ H \mapsto 2\operatorname{tr}(X^{-1}HX^{-1}HX^{-1}H). $$
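The three forms in the list can be checked against finite differences of $t \mapsto \log\det(X+tH)$, whose first three derivatives at $t=0$ should be $\operatorname{tr}(\Delta)$, $-\operatorname{tr}(\Delta^2)$ and $2\operatorname{tr}(\Delta^3)$. The sketch below assumes NumPy; the step size `h`, the seed and the helper names `g` and `vec` are illustrative choices. It also tests one possible reading of the Kronecker identification: with column-major vectorization and symmetric $X$, $U$, $V$, one has $\operatorname{tr}(X^{-1}UX^{-1}V) = \operatorname{vec}(U)^{\mathsf{T}}(X^{-1}\otimes X^{-1})\operatorname{vec}(V)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
X = A @ A.T + n * np.eye(n)   # symmetric positive definite
B = rng.standard_normal((n, n))
H = 0.5 * (B + B.T)           # symmetric direction
C = rng.standard_normal((n, n))
V = 0.5 * (C + C.T)           # second symmetric direction

Y = np.linalg.inv(X)
D = Y @ H                     # Delta = X^{-1} H


def g(t):
    """log det(X + t H) along the line through X in direction H."""
    return np.linalg.slogdet(X + t * H)[1]


def vec(M):
    """Column-major vectorization of a matrix."""
    return M.flatten(order="F")


h = 1e-3  # compromise between truncation error and round-off error

# Central finite-difference estimates of g'(0), g''(0), g'''(0).
d1_fd = (g(h) - g(-h)) / (2 * h)
d2_fd = (g(h) - 2 * g(0.0) + g(-h)) / h**2
d3_fd = (g(2 * h) - 2 * g(h) + 2 * g(-h) - g(-2 * h)) / (2 * h**3)

# Closed forms read off from the trace expansion.
d1 = np.trace(D)
d2 = -np.trace(D @ D)
d3 = 2 * np.trace(D @ D @ D)

# Bilinear form versus the Kronecker-product matrix acting on vec(H), vec(V).
bilinear = -np.trace(Y @ H @ Y @ V)
kron_form = -vec(H) @ np.kron(Y, Y) @ vec(V)

print(d1, d1_fd)
print(d2, d2_fd)
print(d3, d3_fd)
print(bilinear, kron_form)
```

The finite-difference values agree with the closed forms only up to the accuracy of the stencils, so the third derivative in particular matches to a few digits rather than to machine precision.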

1
On

Let
$$\eqalign{
X &= X^T \\
Y &= Y^T = X^{-1} \\
f(X) &= \log\det(X) = -\log\det(Y) \\
}$$
then, in index notation, the first few derivatives of $f$ are
$$\eqalign{
\frac{\partial f}{\partial X_{ij}} &= Y_{ij} \\
\frac{\partial^2 f}{\partial X_{ij}\,\partial X_{k\ell}} &= -Y_{i\ell}Y_{jk} \\
\frac{\partial^3 f}{\partial X_{ij}\,\partial X_{k\ell}\,\partial X_{pq}} &= Y_{ip}Y_{q\ell}Y_{jk} + Y_{i\ell}Y_{jp}Y_{qk} \\
}$$
These formulas result from the repeated application of two simple rules:
$$\eqalign{
dY_{i\ell} &= -\sum_{j}\sum_{k}Y_{ij}\,dX_{jk}\,Y_{k\ell} \\
\frac{\partial X_{ij}}{\partial X_{k\ell}} &= \delta_{ik}\delta_{j\ell} \\
}$$
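As a consistency check, contracting these index formulas with a symmetric $H$ in every index pair should give $-\operatorname{tr}\big((YH)^2\big)$ for the second derivative and $2\operatorname{tr}\big((YH)^3\big)$ for the third. The sketch below assumes NumPy and uses `einsum` with the same index letters as above (with `l` standing for $\ell$); the names `T2` and `T3` for the derivative tensors are hypothetical labels introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
A = rng.standard_normal((n, n))
X = A @ A.T + n * np.eye(n)   # symmetric positive definite
Y = np.linalg.inv(X)          # Y = X^{-1}, also symmetric
B = rng.standard_normal((n, n))
H = 0.5 * (B + B.T)           # symmetric test direction

# Second and third derivative tensors from the index formulas.
T2 = -np.einsum("il,jk->ijkl", Y, Y)
T3 = (np.einsum("ip,ql,jk->ijklpq", Y, Y, Y)
      + np.einsum("il,jp,qk->ijklpq", Y, Y, Y))

# Contract each index pair (ij), (kl), (pq) with H.
quad = np.einsum("ijkl,ij,kl->", T2, H, H)
cubic = np.einsum("ijklpq,ij,kl,pq->", T3, H, H, H)

# Trace expressions for the same quantities.
D = Y @ H
quad_trace = -np.trace(D @ D)
cubic_trace = 2 * np.trace(D @ D @ D)

print(quad, quad_trace)
print(cubic, cubic_trace)
```

Both pairs agree up to round-off, since the contraction is an exact algebraic identity for symmetric $Y$ and $H$ rather than a finite-difference approximation.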