In the book Mathematics for Machine Learning (chapter on Gaussian Mixture Models), I got stuck trying to figure out how they came up with
$$\frac{\partial \det(\mathbf{\Sigma})^{-1/2}}{\partial \mathbf{\Sigma}} = -\frac{1}{2}\det(\mathbf{\Sigma})^{-1/2}\, \mathbf{\Sigma}^{-1}$$
where $\mathbf{\Sigma}$ is the covariance matrix. The book mentions that the following identity was used: $$\frac{\partial \det(f(\mathbf{X}))}{\partial \mathbf{X}} = \det(f(\mathbf{X}))\operatorname{tr}\left( f(\mathbf{X})^{-1} \frac{\partial f(\mathbf{X})}{\partial \mathbf{X}} \right)$$
My question is: how was this identity used to arrive at the first equation? I can't seem to figure it out, and I've been stuck for quite a while.
Thanks!
It's probably easier to work with the logarithmic form of the Jacobi formula $$\eqalign{ \frac{\partial \log\det X}{\partial X} &= X^{-T} &\quad\big({\rm assuming\;}\det X > 0\big) \\ }$$ and to write $S=\Sigma\;$ for ease of typing. $$\eqalign{ c &= (\det S)^{-1/2} &\qquad\qquad&\big({\rm Cost\;function}\big) \\ \log(c) &= -\tfrac 12\log(\det S)\\ \frac{1}{c}\frac{\partial c}{\partial S} &= -\tfrac 12 S^{-T} &\qquad&\big({\rm Logarithmic\;derivative = Jacobi}\big) \\ \frac{\partial c}{\partial S} &= -\tfrac 12 c\,S^{-1} &\qquad&\big(S{\rm \;is\;symmetric,\;so\;}S^{-T}=S^{-1}\big) \\ }$$
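If it helps convince you, the final formula can be checked numerically with NumPy. The sketch below (not from the book) compares the claimed gradient $-\tfrac 12 (\det S)^{-1/2} S^{-1}$ against an entry-wise forward finite difference of $c(S)=(\det S)^{-1/2}$ for a random symmetric positive-definite $S$:

```python
import numpy as np

# Numerical sanity check (assumed setup, not from the book):
# verify  d/dS det(S)^{-1/2} = -1/2 * det(S)^{-1/2} * S^{-1}
# for a symmetric positive-definite S.

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
S = A @ A.T + 3 * np.eye(3)          # symmetric positive definite

def c(M):
    return np.linalg.det(M) ** -0.5   # the "cost function" c from the answer

# Analytic gradient; S^{-T} in general, which equals S^{-1} for symmetric S.
analytic = -0.5 * c(S) * np.linalg.inv(S).T

# Forward finite differences, perturbing each entry independently.
eps = 1e-6
numeric = np.zeros_like(S)
for i in range(3):
    for j in range(3):
        E = np.zeros_like(S)
        E[i, j] = eps
        numeric[i, j] = (c(S + E) - c(S)) / eps

print(np.max(np.abs(numeric - analytic)))   # small, limited by eps
```

Note that the finite difference treats $S$ as an unconstrained matrix, which is why the analytic expression uses $S^{-T}$; symmetry of $S$ then collapses it to $S^{-1}$ as in the derivation above.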