I am working with function (and their derivatives) of matrices, in particular of symmetric and positive definite (SPD) matrices. I am interested in keeping a matrix notation as long as possible, instead of using vectorization.
The specific problem that I am facing now is with the Kronecker product and how to manage the object that it produces. Specifically, I obtained (some context will be provided below) this product (note the explicit indexes): $$ A_{ik}A_{jl} - B_{ik}A_{jl} - B_{il}A_{jk} $$ where $A$ and $B$ are squared, with the same dimension $p$ and SPD matrices. My question is: can I rewrite this object as a Kronecker product? If yes, it exists some way to clean up the two second terms? At the first question, I would be tempted to write something along $$ (A \otimes A + B \otimes A)_{p(i-1)+j, p(k-1)+l} - (B \otimes A)_{p(i-1)+j, p(l-1)+k}$$ (I am following the wikipedia page here). Is it possible to obtain a cleaner expression?
Context:
What I am trying to do is to obtain the Hessian of the Loglikelihood for a Multivariate Normal distribution. This in practice requires to compute the derivatives of: $$ l = -\frac{n}{2}\log\det\left|\Sigma\right| - \frac{1}{2}\text{Tr}\left[S\Sigma^{-1}\right] $$ with $S=XX^\top$ a symmetric matrix obtained from the data (in this setting I am considering the mean equal to $0$) and $\Sigma$ the covariance matrix.
I have to compute $\partial l/\partial\Sigma$ and then $\partial^2l/(\partial\Sigma\partial\Sigma)$. If I have not made mistakes, they are: $$ \frac{\partial l}{\partial\Sigma} = -\frac{n}{2}\Sigma^{-1} + \frac{1}{2}\Sigma^{-1}S\Sigma^{-1}$$ and $$ \frac{\partial^2 l}{\partial\Sigma_{ij}\partial\Sigma_{kl}} = \Sigma^{-1}_{ik}\Sigma^{-1}_{jl} - (\Sigma^{-1}S\Sigma^{-1})_{ik}\Sigma^{-1}_{jl} - (\Sigma^{-1}S\Sigma^{-1})_{il}\Sigma^{-1}_{jk}$$ which is the expression that I wrote at the beginning of the question (with $A=\Sigma^{-1}$ and $B=\Sigma^{-1}S\Sigma^{-1}$).
Thus, if you notice some error here that could save my day! :D
Disclaimer
I know that what I am trying to do can be obtained more easily by using the vec and vech operators to work with standard vectors, but I would really prefer to keep the matrix notation for as long as possible. If that was not possible, I will be forced to transform the matrices (and I mostly know how to do it, but in any case that would be another question), but I hope to avoid it.
Thank y'all for the help!
For ease of typing, define $$\eqalign{ M &= \Sigma^{-1} \;\implies\; dM = -M\,d\Sigma\,M }$$ Your gradient is correct, so let's start with that and find its differential. $$\eqalign{ G &= -\tfrac{1}{2} (nM-MSM) \\ dG &= -\tfrac{1}{2} (n\,dM-dM\,SM-MS\,dM) \\ &= +\tfrac{1}{2} (n\,M\,d\Sigma\,M-M\,d\Sigma\,M\,SM-MSM\,d\Sigma\,M) \\ &= +\tfrac{1}{2} (n\,M\,d\Sigma\,M-M\,d\Sigma\,(2G+nM)-(2G+nM)\,d\Sigma\,M) \\ &= -\tfrac{1}{2} (n\,M\,d\Sigma\,M+2M\,d\Sigma\,G+2G\,d\Sigma\,M) \\ }$$ At this point, we'd normally use vec/vech operations, but you don't want to do that.
So let's introduce the double-dot product between tensors $$\eqalign{ A={\cal B}:C \;\implies\; A_{ij}= \sum_{k,l} {\cal B}_{ijkl}C_{kl} \\ }$$ Let's also introduce the 4th order isotropic tensor ${\cal E}$ with components ${\cal E}_{ijkl} = \delta_{ik}\delta_{jl}$
This tensor is the identity for the double-dot product, i.e. $\;A:{\cal E}= A = {\cal E}:A$
Another useful property is untangling matrix products $\implies A\,dX\,B = A{\cal E}B^T:dX$
Continuing from before $$\eqalign{ dG &= -\tfrac{1}{2} \big(n\,M{\cal E}M+2M{\cal E}G+2G{\cal E}M\big):d\Sigma \\ {\cal H} = \frac{\partial G}{\partial \Sigma} &= -\tfrac{1}{2} \big(n\,M{\cal E}M+2M{\cal E}G+2G{\cal E}M\big) \\ {\cal H}_{ijkl} = \frac{\partial G_{ij}}{\partial \Sigma_{kl}} &= -\tfrac{1}{2} \big(n\,M_{ip}{\cal E}_{pjkq}M_{ql} + 2M_{ip}{\cal E}_{pjkq}G_{ql} + 2G_{ip}{\cal E}_{pjkq}M_{ql}\big) \\ &= -\tfrac{n}{2}M_{ik}M_{jl} -M_{ik}G_{jl} -G_{ik}M_{jl} \\ }$$ I think it looks better with the $G$'s but you can eliminate them in favor of $S,M,\pm$ signs, and more indices. $$\eqalign{ {\cal H}_{ijkl} &= \tfrac{1}{2} \big( n\,M_{ik}M_{jl} - M_{ik}M_{jp}S_{pq}M_{ql} - M_{ip}S_{pq}M_{qk}M_{jl}\big) \\ &= \tfrac{1}{2} \big( n\,M_{ik}M_{jl} - M_{ik}(MSM)_{jl} - (MSM)_{ik}M_{jl}\big) \\ }$$