I'm reading Tipping and Bishop's paper on Probabilistic Principal Component Analysis. To get the maximum-likelihood estimates of the variables $W \in \Re^{d \times q}$ and $\sigma^2 \in \Re^+$, they maximize the log-likelihood below. Here $N, d, q \in \mathbb{N}$ and the sample covariance matrix $S \in \Re^{d \times d}$ are constants.
$$\mathcal{L} = -\frac{N}{2}\left\{d \ln(2\pi) + \ln|C| + \operatorname{tr}(C^{-1}S)\right\}$$ $$\text{where } C = WW^T + \sigma^2I$$ They set $$\frac{\partial \mathcal{L}}{\partial W} = 0$$ and obtain eqn $(10)$ on page $619$, stating that some "standard matrix calculus" results were used. I couldn't find any "direct" results in this book. Can someone please point me to a relevant resource for this heavy matrix calculus?
Formulas (57) and (63) from the Matrix Cookbook give the matrix derivatives of $\ln\det(\mathbf{C})$ and $\operatorname{tr}(\mathbf{C}^{-1} \mathbf{S})$ respectively.
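For convenience, here is a restatement of those two results in differential form, assuming $\mathbf{C}$ and $\mathbf{S}$ are symmetric and writing $\mathbf{A}:\mathbf{B} = \operatorname{tr}(\mathbf{A}^T\mathbf{B})$. This is a paraphrase rather than a quote, since formula numbers can differ between editions of the Cookbook:

$$
d\ln\det(\mathbf{C}) = \mathbf{C}^{-1} : d\mathbf{C},
\qquad
d\operatorname{tr}(\mathbf{C}^{-1}\mathbf{S}) = -\,\mathbf{C}^{-1}\mathbf{S}\,\mathbf{C}^{-1} : d\mathbf{C}
$$

The second identity follows from $d(\mathbf{C}^{-1}) = -\mathbf{C}^{-1}(d\mathbf{C})\,\mathbf{C}^{-1}$ together with the cyclic property of the trace.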
Writing $\phi$ for the log-likelihood $\mathcal{L}$ and using the fact that $\mathbf{C}$ and $\mathbf{S}$ are symmetric, this yields the differential \begin{eqnarray} d\phi &=& -\frac{N}{2} \left[ \mathbf{C}^{-1} - \mathbf{C}^{-1} \mathbf{S} \mathbf{C}^{-1} \right] : d\mathbf{C} \\ &=& N \left[ \mathbf{C}^{-1} \mathbf{S} \mathbf{C}^{-1} - \mathbf{C}^{-1} \right]\mathbf{W} : d\mathbf{W} \end{eqnarray} where the symbol $:$ stands for the Frobenius inner product, $\mathbf{A}:\mathbf{B} = \operatorname{tr}(\mathbf{A}^T\mathbf{B})$.
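UPDATE: By definition, $\mathbf{C}= \mathbf{W}\mathbf{W}^T+\sigma^2 \mathbf{I}$, so $ d\mathbf{C}= (d\mathbf{W})\mathbf{W}^T + \mathbf{W}(d\mathbf{W})^T $. Because the bracketed matrix is symmetric, both terms contribute equally to the inner product, so inside it $d\mathbf{C}$ can be replaced by $2(d\mathbf{W})\mathbf{W}^T$. The passage to the second line then uses the useful property of the Frobenius inner product $\mathbf{A}:\mathbf{B}\mathbf{C}^T =\mathbf{A}\mathbf{C}:\mathbf{B}$.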
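Spelled out in trace notation, the step looks roughly like this. The shorthand $\mathbf{M} = \mathbf{C}^{-1}\mathbf{S}\mathbf{C}^{-1} - \mathbf{C}^{-1}$ is introduced here only for brevity; it is symmetric because $\mathbf{C}$ and $\mathbf{S}$ are:

$$
\begin{aligned}
d\phi &= \tfrac{N}{2}\,\mathbf{M} : \left[(d\mathbf{W})\mathbf{W}^T + \mathbf{W}(d\mathbf{W})^T\right] \\
&= \tfrac{N}{2}\left[\operatorname{tr}\!\left(\mathbf{W}^T\mathbf{M}^T\,d\mathbf{W}\right) + \operatorname{tr}\!\left(\mathbf{W}^T\mathbf{M}\,d\mathbf{W}\right)\right] \\
&= N\,\operatorname{tr}\!\left((\mathbf{M}\mathbf{W})^T d\mathbf{W}\right) \\
&= N\,\mathbf{M}\mathbf{W} : d\mathbf{W}
\end{aligned}
$$

The second line uses $\operatorname{tr}(\mathbf{X}) = \operatorname{tr}(\mathbf{X}^T)$ and the cyclic property of the trace; the third uses $\mathbf{M}^T = \mathbf{M}$.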
Finally the gradient is $$ \frac{\partial \phi}{\partial \mathbf{W}} = N \left[ \mathbf{C}^{-1} \mathbf{S} \mathbf{C}^{-1} - \mathbf{C}^{-1} \right]\mathbf{W} $$
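If you want to convince yourself numerically, the gradient above can be compared against central finite differences. The following is only a sanity-check sketch under assumptions of my own: the function names, the random test data, and the values of `N`, `d`, `q` and `sigma2` are illustrative and do not come from the paper.

```python
import numpy as np

def log_likelihood(W, sigma2, S, N):
    """Log-likelihood phi, with C = W W^T + sigma2 * I."""
    d = S.shape[0]
    C = W @ W.T + sigma2 * np.eye(d)
    _, logdet = np.linalg.slogdet(C)          # C is positive definite
    trace_term = np.trace(np.linalg.solve(C, S))  # tr(C^{-1} S)
    return -0.5 * N * (d * np.log(2 * np.pi) + logdet + trace_term)

def analytic_gradient(W, sigma2, S, N):
    """N [C^{-1} S C^{-1} - C^{-1}] W, the closed form derived above."""
    d = S.shape[0]
    C_inv = np.linalg.inv(W @ W.T + sigma2 * np.eye(d))
    return N * (C_inv @ S @ C_inv - C_inv) @ W

def numeric_gradient(f, W, eps=1e-6):
    """Entry-by-entry central finite differences of a scalar function f."""
    grad = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            E = np.zeros_like(W)
            E[i, j] = eps
            grad[i, j] = (f(W + E) - f(W - E)) / (2 * eps)
    return grad

# Illustrative test problem (arbitrary sizes and random data).
rng = np.random.default_rng(0)
N, d, q, sigma2 = 50, 5, 2, 0.7
X = rng.standard_normal((N, d))
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / N                      # sample covariance matrix
W = rng.standard_normal((d, q))

g_analytic = analytic_gradient(W, sigma2, S, N)
g_numeric = numeric_gradient(lambda V: log_likelihood(V, sigma2, S, N), W)
max_abs_diff = np.max(np.abs(g_analytic - g_numeric))
print("max abs difference:", max_abs_diff)
```

The two gradients should agree up to finite-difference error. If you drop the factor of 2 that comes from $d\mathbf{C}$, or flip the sign of the log-likelihood, the mismatch shows up immediately.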