Suppose the $k \times p$ matrix $\pmb{F}$ has columns $\pmb{f}_1, ..., \pmb{f}_{p}.$ For each $j \in \{1, ..., p\},$ how can I compute the gradient $$\nabla_{\pmb{f}_j} \text{tr}\left[(\pmb{F}^T \pmb{F})^{-1} \pmb{S}\right]$$ where $\pmb{S}$ is a $p \times p$ matrix not depending on elements of $\pmb{F}?$ I could find the gradient if the inverse weren't there, but I'm not sure how to calculate it with the inverse present. Any suggestions?
As an alternative way of expressing the question, let $g(\pmb{F}) = \text{tr}\left[(\pmb{F}^T \pmb{F})^{-1} \pmb{S}\right].$ Is there a nice expression for the $k \times p$ matrix of partial derivatives with entry $(i,j)$ equal to the partial derivative of the function $g$ with respect to the $(i,j)$ entry of $\pmb{F}?$
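Assume $F$ has full column rank, so that $M = F^TF$ is invertible (and symmetric). Write the objective function in terms of the Frobenius inner product, which I'll denote by a colon, $A:B = {\rm tr}(A^TB)$. Because $M^{-1}$ is symmetric, $$\eqalign{ \lambda &= {\rm tr}((F^TF)^{-1}S) \cr &=S:(F^TF)^{-1} \cr &=S:M^{-1} \cr }$$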
Now find the differential and gradient of this function, using $d(M^{-1}) = -M^{-1}\,dM\,M^{-1}$ and the symmetry of $M$ $$\eqalign{ d\lambda &= -S:M^{-1}\,dM\,M^{-1} \cr &= -M^{-1}SM^{-1}:dM \cr &= -M^{-1}SM^{-1}:(dF^T\,F+F^T\,dF) \cr &= -M^{-1}(S+S^T)M^{-1}:F^T\,dF \cr &= -FM^{-1}(S+S^T)M^{-1}:dF \cr \cr G=\frac{\partial\lambda}{\partial F} &= -FM^{-1}(S+S^T)M^{-1} \cr &= -F(F^TF)^{-1}(S+S^T)(F^TF)^{-1} \cr \cr }$$ Here $G$ is the $k \times p$ matrix of partial derivatives, so the gradient with respect to the $j^{th}$ column of $F$ is just the $j^{th}$ column of $G$. If $S$ is symmetric, this simplifies to $G = -2FM^{-1}SM^{-1}$.
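As a sanity check, the closed form can be compared against central finite differences. The following is a minimal NumPy sketch, not part of the derivation itself: the function names `g`, `grad_g`, and `numerical_grad`, the dimensions, the random seed, and the step size `h` are all illustrative assumptions.

```python
import numpy as np

def g(F, S):
    """Objective tr((F^T F)^{-1} S)."""
    M = F.T @ F
    # solve(M, S) computes M^{-1} S without forming the inverse explicitly
    return np.trace(np.linalg.solve(M, S))

def grad_g(F, S):
    """Closed-form gradient -F M^{-1} (S + S^T) M^{-1}, a k x p matrix."""
    M = F.T @ F
    Minv = np.linalg.inv(M)
    return -F @ Minv @ (S + S.T) @ Minv

def numerical_grad(F, S, h=1e-6):
    """Central finite differences, one entry of F at a time."""
    G = np.zeros_like(F)
    for i in range(F.shape[0]):
        for j in range(F.shape[1]):
            E = np.zeros_like(F)
            E[i, j] = h
            G[i, j] = (g(F + E, S) - g(F - E, S)) / (2 * h)
    return G

# Illustrative sizes and data (assumptions, not from the question)
rng = np.random.default_rng(0)
k, p = 6, 3
F = rng.standard_normal((k, p))
S = rng.standard_normal((p, p))  # deliberately non-symmetric

G = grad_g(F, S)
G_num = numerical_grad(F, S)
print(np.max(np.abs(G - G_num)))  # should be small if the formula is right

# Gradient with respect to the j-th column of F (0-based index here)
j = 1
grad_fj = G[:, j]
```

Using a non-symmetric $S$ in the check matters: it exercises the $S+S^T$ term, which a symmetric test matrix would not distinguish from $2S$.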