I understand
(1). $\dfrac{\partial}{\partial A}\det\left( \mathbf{f}(A)\right) = \det\left( \mathbf{f}(A)\right) \operatorname{tr}\left( \mathbf{f}(A)^{-1}\dfrac{\partial \mathbf{f}(A)}{\partial A}\right)$ via Jacobi's formula. Here $A \in \mathbb{R}^{m\times n}$.
I also know, as a special case, that ${\displaystyle {\partial \det(A) \over \partial A_{ij}}=\operatorname{adj}^{\mathrm{T}}(A)_{ij} = \det(A)\left(A^{-T}\right)_{ij}}$, so $\dfrac{\partial \det\left( A\right)}{\partial A}=\det\left( A\right) A^{-T}$.
But when I try to derive this result from (1) by taking $\mathbf{f}\left( A\right) = A$, I get $\dfrac{\partial \det\left( A\right)}{\partial A}=\det\left( A\right) \operatorname{tr}(A^{-1})$, which doesn't match up. Where am I going wrong?
Jacobi's formula says
$$\frac{d}{d t} \operatorname{det} A(t)=\operatorname{tr}\left(\operatorname{adj}(A(t)) \frac{d A(t)}{d t}\right)$$
for a matrix $A$ that depends on a scalar value $t$. However, in your case you have $f(A)$, which is presumably a matrix-valued function of a matrix, so the formula is not directly applicable here. We also need $A\in\mathbb R^{n\times n}$ and $f\colon\mathbb R^{n\times n}\to\mathbb R^{n\times n}$, not $m\times n$, since otherwise $\det(f(A))$ is not necessarily even defined.
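As a quick sanity check, the scalar-$t$ version of Jacobi's formula is easy to verify numerically. Here is a small NumPy sketch; the affine curve $A(t)=A_0+tB$, the matrix size, and the tolerance are all arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def adjugate(M):
    # adj(M) = det(M) * inv(M) for invertible M (fine for a random test matrix)
    return np.linalg.det(M) * np.linalg.inv(M)

# A(t) = A0 + t*B is a matrix curve depending on the scalar t, with dA/dt = B
A0 = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
A = lambda t: A0 + t * B

t, h = 0.3, 1e-6
# central finite difference of det(A(t)) vs. Jacobi's formula
lhs = (np.linalg.det(A(t + h)) - np.linalg.det(A(t - h))) / (2 * h)
rhs = np.trace(adjugate(A(t)) @ B)
assert abs(lhs - rhs) < 1e-4
```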
By the chain rule we have
$$ \frac{\partial \det(f(A))}{\partial A} = \frac{\partial \det(f(A))}{\partial f(A)}\circ\frac{\partial f(A)}{\partial A} $$
Note that I write "$\circ$" instead of "$\cdot$", since the chain rule really says that the derivative of a composition of functions is the composition of the derivatives. Because linear functions correspond to matrices, and matrix multiplication is equivalent to composition of linear functions, people usually write "$\cdot$" instead. However, this becomes troublesome once we take derivatives w.r.t. matrices, since we can encounter higher-order tensors, such as the 4th-order tensor $\frac{\partial f(A)}{\partial A}$, for which matrix multiplication no longer makes sense. Instead, we need to think in terms of more general tensor contractions (you can keep using "$\cdot$" if you are aware that it means tensor contraction). In this specific case, the two derivatives are combined as
$$ \frac{\partial \det(f(A))}{\partial A_{kl}} = \sum_{ij} \frac{\partial \det(f(A))}{\partial f(A)_{ij}} \frac{\partial f(A)_{ij}}{\partial A_{kl}} \qquad(1)$$
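To make the contraction in $(1)$ concrete, here is a small finite-difference check in NumPy. This is a sketch; the choice $f(A)=A^2$, the matrix size, and the tolerance are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
n, h = 3, 1e-6

f = lambda A: A @ A            # example matrix-valued f; any smooth f works
A = rng.standard_normal((n, n))

# d det(f(A)) / d f(A) evaluated at F = f(A): adj(F)^T = det(F) F^{-T}
F = f(A)
D_det = np.linalg.det(F) * np.linalg.inv(F).T

# d f(A)_ij / d A_kl by central differences -> 4th-order tensor
Df = np.zeros((n, n, n, n))
for k in range(n):
    for l in range(n):
        E = np.zeros((n, n)); E[k, l] = h
        Df[:, :, k, l] = (f(A + E) - f(A - E)) / (2 * h)

# right-hand side of (1): contract over the indices i, j
rhs = np.einsum('ij,ijkl->kl', D_det, Df)

# left-hand side: finite differences of det(f(A)) directly
lhs = np.zeros((n, n))
for k in range(n):
    for l in range(n):
        E = np.zeros((n, n)); E[k, l] = h
        lhs[k, l] = (np.linalg.det(f(A + E)) - np.linalg.det(f(A - E))) / (2 * h)

assert np.allclose(lhs, rhs, atol=1e-4)
```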
Long story short, we have $\frac{\partial \det(f(A))}{\partial f(A)} =\operatorname{adj}(f(A))^T$, however we need to be careful of how it is composed with $\frac{\partial f(A)}{\partial A}$. If $f(A)=A$ then
$$\frac{\partial A}{\partial A} = \Big(\frac{\partial A_{ij}}{\partial A_{kl}}\Big)^{ij}_{kl} = (\delta_{ik}\delta_{jl})^{ij}_{kl} = I\otimes I$$
is the identity tensor, as one would expect.
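One can check the identity-tensor claim numerically via its defining property, namely that contracting it against any matrix $H$ returns $H$ (a small NumPy sketch):

```python
import numpy as np

n = 3
I = np.eye(n)
# (I ⊗ I)_{ij,kl} = delta_ik * delta_jl
T = np.einsum('ik,jl->ijkl', I, I)

# defining property of the identity tensor: contracting with any H returns H
H = np.arange(n * n, dtype=float).reshape(n, n)
assert np.allclose(np.einsum('ijkl,kl->ij', T, H), H)
```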
Example: in particular, we have the general rule
$$\frac{\partial f(A)}{\partial A} = U\otimes V \implies \frac{\partial \det(f(A))}{\partial A} = V^T\operatorname{adj}(f(A))^T U$$
Because plugging $U\otimes V = (U_{ik}V_{jl})^{ij}_{kl}$ into $(1)$, writing $C := \frac{\partial \det(f(A))}{\partial f(A)} = \operatorname{adj}(f(A))^T$, yields $$ \frac{\partial \det(f(A))}{\partial A_{kl}} = \sum_{ij} C_{ij}U_{ik}V_{jl} \implies \frac{\partial \det(f(A))}{\partial A} = U^T C V$$
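For a concrete instance, take $f(A) = UAV^T$, so that $f(A)_{ij} = \sum_{kl} U_{ik}A_{kl}V_{jl}$ and hence $\frac{\partial f(A)}{\partial A} = U\otimes V$ in the index layout used here. The $U^T C V$ form can then be checked against finite differences (a NumPy sketch; size, seed, and tolerance are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n, h = 3, 1e-6
U, V, A = (rng.standard_normal((n, n)) for _ in range(3))

f = lambda A: U @ A @ V.T      # then d f(A)_ij / d A_kl = U_ik V_jl, i.e. U ⊗ V

F = f(A)
C = np.linalg.det(F) * np.linalg.inv(F).T   # adj(F)^T
grad = U.T @ C @ V                          # the U^T C V layout

# finite-difference check of d det(f(A)) / d A_kl
fd = np.zeros((n, n))
for k in range(n):
    for l in range(n):
        E = np.zeros((n, n)); E[k, l] = h
        fd[k, l] = (np.linalg.det(f(A + E)) - np.linalg.det(f(A - E))) / (2 * h)

assert np.allclose(grad, fd, atol=1e-4)
```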
EDIT: Actually here I get the transposed version because of a different layout convention, but you get the point. A very useful resource to check and verify such computations is the website http://www.matrixcalculus.org/ which to my knowledge is the only CAS that can perform this kind of matrix calculus.
EDIT 2: OK, so the book you cited in turn references the Matrix Cookbook, which is itself just a formula collection. The identity you cite is presented there only in differential form
$$ \partial(\operatorname{det}(\mathbf{X}))=\operatorname{det}(\mathbf{X}) \operatorname{Tr}\left(\mathbf{X}^{-1} \partial \mathbf{X}\right)$$
However, right off the bat: you should never use this identity, because it only works when $X$ is invertible. Instead, one should use
$$\partial(\operatorname{det}(\mathbf{X}))=\operatorname{tr}(\operatorname{adj}(\mathbf{X}) \partial \mathbf{X}) = \operatorname{adj}(\mathbf{X})^T \cdot \partial \mathbf{X}$$
since the adjugate always exists (*). Noting again that both the trace and "$\cdot$" here are really more general tensor contractions, not just the standard matrix trace/matrix multiplication, this formula is equivalent to $(1)$.
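To see why this matters: for a singular $X$ the $\det(X)\operatorname{Tr}(X^{-1}\partial X)$ form is unusable, but the adjugate form still gives the correct derivative. A NumPy sketch (the cofactor-based adjugate and the diagonal test matrices are ad-hoc choices for illustration):

```python
import numpy as np

def adjugate(M):
    # adjugate via cofactors: adj(M)_ij = (-1)^{i+j} det(M with row j, col i deleted)
    # this exists even when M is singular
    n = M.shape[0]
    adj = np.empty_like(M, dtype=float)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(M, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

X = np.diag([1., 2., 0.])       # singular (rank 2): det(X) X^{-1} is unusable here
dX = np.diag([0.1, 0.2, 0.3])   # direction of perturbation

# directional derivative of det at X, by central finite differences
h = 1e-6
fd = (np.linalg.det(X + h * dX) - np.linalg.det(X - h * dX)) / (2 * h)

# the adjugate formula still applies
assert np.isclose(fd, np.trace(adjugate(X) @ dX), atol=1e-4)
```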
(*) Keep in mind that when you implement this, you don't actually want to compute inverses or adjugates, but instead replace them with calls to a linear-system solver.
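A minimal illustration of this remark for the invertible case: the factor $\operatorname{tr}(X^{-1}\,\partial X)$ can be computed with a solve instead of forming the inverse explicitly (sketch; matrix size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
dA = rng.standard_normal((4, 4))

# tr(A^{-1} dA): a linear solve replaces the explicit inverse
direct = np.trace(np.linalg.inv(A) @ dA)
via_solve = np.trace(np.linalg.solve(A, dA))
assert np.isclose(direct, via_solve)
```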
Obviously, both notations are not optimal, since they do not tell us explicitly over which axes we have to contract. If you need this information as well, you'll have to stick to Einstein index notation, or use something more exotic like what was suggested in this paper: https://arxiv.org/abs/1208.0197 (which actually helped me a lot to clear up some confusion, although I do not use the suggested notation myself).
Remark: As a side note, the reason why I think traces should be avoided is twofold. On the one hand, traces are tensor contractions, so the notation is somewhat redundant. On the other hand, I have seen way too often that people actually implement $\operatorname{tr}(A^TB)$ literally, which is extremely inefficient: you compute the whole matrix product but only need its diagonal entries.
Remark 2: By the way, the Frobenius product is nothing but the induced inner product on $\mathbb R^m\otimes \mathbb R^n$, cf. https://en.m.wikipedia.org/wiki/Tensor_product_of_Hilbert_spaces (consider a matrix as $A=\sum_{i=1}^m\sum_{j=1}^n A_{ij} e_i \otimes e_j$)