I can't find the derivative of $\det(X'AX)$ with respect to $X$.
I tried to use the chain rule, $$\nabla_X f = \det(X'AX)\, (X'AX)^{-T}\, [X'AX]$$ since $\nabla_X \det X = \det X \cdot X^{-T}$, but the last two factors lead to 4th-order tensors, and I don't know where I made a mistake.
Define a new matrix variable
$$\eqalign{
\def\c#1{\color{red}{#1}} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\sym#1{\op{sym}\LR{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\qiq{\quad\implies\quad}
Y &\equiv X^TAX \\
dY &= \LR{X^TA\;dX+dX^TAX} \\
}$$
Then differentiate the function, using Jacobi's formula for the determinant (written here for invertible $Y$)
$$\eqalign{
f &= \det(Y) \\
df &= f\; Y^{-T}:dY \\
&= f\;Y^{-T}:\LR{X^TA\;dX+dX^TAX} \\
&= f\;Y^{-T}:\LR{X^TA\;dX} + f\;Y^{-T}:\LR{X^TA^T\,dX}^T \\
&= f\;Y^{-T}:\LR{X^TA\;dX} + f\;Y^{-1}:\LR{X^TA^T\,dX} \\
&= \LR{f\;A^TXY^{-T}}:dX + \LR{f\;AXY^{-1}}:dX \\
&= \LR{f\;A^TXY^{-T}+f\;AXY^{-1}}:dX \\
\grad{f}{X}\; &= \;{f\;A^TXY^{-T}+f\;AXY^{-1}} \\
}$$
where the matrix inner product has been denoted by a colon
$$\eqalign{
A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A^TB} \\
A:A &= \frob{A}^2 \qquad \{ {\rm Frobenius\;norm} \} \\
A:B &= B:A \;=\; B^T:A^T \\
C:\LR{AB} &= A:\LR{CB^T} \;=\; B:\LR{A^TC} \\
}$$
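The closed-form gradient can be sanity-checked numerically. The following is a minimal sketch, not part of the derivation: it assumes NumPy, a random non-symmetric $A$ and a random tall $X$ (so that $Y$ is almost surely invertible), and the helper names `f`, `grad_f` and `numerical_grad` are made up for illustration. It compares the formula against central finite differences.

```python
import numpy as np

def f(X, A):
    # f(X) = det(X^T A X)
    return np.linalg.det(X.T @ A @ X)

def grad_f(X, A):
    # Closed form: f * (A^T X Y^{-T} + A X Y^{-1}), with Y = X^T A X
    Y = X.T @ A @ X
    Y_inv = np.linalg.inv(Y)
    return np.linalg.det(Y) * (A.T @ X @ Y_inv.T + A @ X @ Y_inv)

def numerical_grad(X, A, h=1e-6):
    # Central finite differences, one entry of X at a time
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(X + E, A) - f(X - E, A)) / (2 * h)
    return G

# Illustrative sizes and seed (assumptions, not from the question)
rng = np.random.default_rng(0)
n, k = 5, 3
A = rng.standard_normal((n, n))   # deliberately not symmetric
X = rng.standard_normal((n, k))

agree = np.allclose(grad_f(X, A), numerical_grad(X, A), rtol=1e-5, atol=1e-6)
print("closed form matches finite differences:", agree)
```

If $A$ happens to be symmetric, then $Y$ is symmetric as well and the two terms coincide, so the gradient collapses to $2f\,AXY^{-1}$.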
In matrix calculus, the chain rule is impractical because it involves higher-order tensors. These tensors have no intrinsic interest; they are merely intermediate quantities that must be calculated in order to apply the chain rule.
Furthermore, these tensors are awkward to write using standard matrix notation and require the introduction of special contraction products in order to effectively manipulate them.
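To make this concrete for the function in the question, here is a sketch of the chain rule carried out in index notation. The indices $i,j,k,l$ and the Kronecker delta $\delta$ are introduced here for illustration only; $Y = X^TAX$ and $f=\det Y$ as before.
$$\eqalign{
\frac{\partial Y_{ij}}{\partial X_{kl}} &= \delta_{il}\,(AX)_{kj} + \delta_{jl}\,(A^TX)_{ki} \\
\frac{\partial f}{\partial X_{kl}} &= \sum_{i,j} f\,(Y^{-T})_{ij}\,\frac{\partial Y_{ij}}{\partial X_{kl}} \\
&= f\sum_{j}(AX)_{kj}(Y^{-1})_{jl} + f\sum_{i}(A^TX)_{ki}(Y^{-T})_{il} \\
&= f\,\big(AXY^{-1} + A^TXY^{-T}\big)_{kl} \\
}$$
The fourth-order tensor $\partial Y_{ij}/\partial X_{kl}$ is contracted away immediately, and the result agrees with $f\,A^TXY^{-T}+f\,AXY^{-1}$. The mistake in the attempted formula is that the last factor cannot be written as an ordinary matrix to be multiplied on the right.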
By contrast, the differential of a matrix behaves like a matrix. In particular, it obeys all of the rules of Matrix Algebra.
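As a short illustration for the function above, the differential of $Y = X^TAX$ follows from perturbing $X$ by $dX$ and keeping only the first-order terms:
$$\eqalign{
(X+dX)^TA\,(X+dX) - X^TAX &= dX^TAX + X^TA\,dX + dX^TA\,dX \\
dY &= dX^TAX + X^TA\,dX \\
}$$
The term $dX^TA\,dX$ is second order in $dX$ and is dropped. What remains is the usual product rule with the order of the factors preserved, and every quantity in it is an ordinary matrix.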