Gradient of $\text{tr}\left[(\pmb{F}^T \pmb{F})^{-1} \pmb{S}\right]$ with respect to the columns of $\pmb{F}$?

Suppose the $k \times p$ matrix $\pmb{F}$ has columns $\pmb{f}_1, ..., \pmb{f}_{p}.$ For each $j \in \{1, ..., p\},$ how can I compute the gradient $$\nabla_{\pmb{f}_j} \text{tr}\left[(\pmb{F}^T \pmb{F})^{-1} \pmb{S}\right]$$ where $\pmb{S}$ is a $p \times p$ matrix not depending on elements of $\pmb{F}?$ I could find the gradient if the inverse weren't there, but I'm not sure how to calculate it with the inverse present. Any suggestions?

As an alternative way of expressing the question, let $g(\pmb{F}) = \text{tr}\left[(\pmb{F}^T \pmb{F})^{-1} \pmb{S}\right].$ Is there a nice expression for the $k \times p$ matrix of partial derivatives with entry $(i,j)$ equal to the partial derivative of the function $g$ with respect to the $(i,j)$ entry of $\pmb{F}?$

Best answer

Write the objective function in terms of the Frobenius inner product $A:B = {\rm tr}(A^TB)$, which I'll denote by a colon, and let $M = F^TF$ (a symmetric matrix) $$\eqalign{ \lambda &= {\rm tr}((F^TF)^{-1}S) \cr &=S:(F^TF)^{-1} \cr &=S:M^{-1} \cr }$$

Now find the differential and gradient of this function, using $d(M^{-1}) = -M^{-1}\,dM\,M^{-1}$, the symmetry of $M^{-1}$, and the fact that $A:B = A^T:B^T$ $$\eqalign{ d\lambda &= -S:M^{-1}\,dM\,M^{-1} \cr &= -M^{-1}SM^{-1}:dM \cr &= -M^{-1}SM^{-1}:(dF^T\,F+F^T\,dF) \cr &= -M^{-1}(S+S^T)M^{-1}:F^T\,dF \cr &= -FM^{-1}(S+S^T)M^{-1}:dF \cr \cr G=\frac{\partial\lambda}{\partial F} &= -FM^{-1}(S+S^T)M^{-1} \cr &= -F(F^TF)^{-1}(S+S^T)(F^TF)^{-1} \cr }$$ The gradient with respect to the $j^{th}$ column of $F$ is just the $j^{th}$ column of $G$.
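If you want to sanity-check the closed form, one option is to compare it against central finite differences. The following is only a minimal NumPy sketch: the function names (`objective`, `gradient`, `finite_difference_gradient`), the sizes `k = 8`, `p = 3`, the random test matrices, and the step size `eps` are all assumptions chosen for illustration, not part of the derivation.

```python
import numpy as np


def objective(F, S):
    """g(F) = tr[(F^T F)^{-1} S]."""
    M = F.T @ F
    # solve(M, S) computes M^{-1} S without forming the inverse explicitly
    return np.trace(np.linalg.solve(M, S))


def gradient(F, S):
    """Closed form G = -F M^{-1} (S + S^T) M^{-1}, with M = F^T F."""
    M = F.T @ F
    M_inv = np.linalg.inv(M)
    return -F @ M_inv @ (S + S.T) @ M_inv


def finite_difference_gradient(F, S, eps=1e-6):
    """Entrywise central differences of g; illustrative step size (assumption)."""
    G = np.zeros_like(F)
    for i in range(F.shape[0]):
        for j in range(F.shape[1]):
            E = np.zeros_like(F)
            E[i, j] = eps
            G[i, j] = (objective(F + E, S) - objective(F - E, S)) / (2 * eps)
    return G


# Arbitrary test problem (assumed sizes; S is deliberately not symmetric)
rng = np.random.default_rng(0)
k, p = 8, 3
F = rng.standard_normal((k, p))
S = rng.standard_normal((p, p))

G = gradient(F, S)                        # k x p matrix of partial derivatives
G_fd = finite_difference_gradient(F, S)   # numerical approximation of the same
max_abs_error = np.max(np.abs(G - G_fd))
print("max |G - G_fd| =", max_abs_error)
```

With zero-based indexing, `G[:, j - 1]` is then the gradient with respect to $f_j$. Using a non-symmetric `S` in the check exercises the $S+S^T$ term, which would reduce to $2S$ for symmetric $S$.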