Complicated trace derivative


Given a symmetric matrix $Y$ and matrices $Z$ and $X$, what is the derivative with respect to $Z$ of the trace $$ \text{tr}( (XX^T-YZZ^TY)^T (XX^T-YZZ^TY) )? $$ I have looked all over for a straightforward way of computing this, but everything I can think of seems to require very long, messy calculations. This shouldn't matter, but the matrix dimensions are $Z_{N \times k}$, $Y_{N \times N}$, $X_{N \times N}$. This comes from an optimization problem, and I'm expecting the solution to be something to the effect of $ZZ^T=XX^T$.


There is 1 best solution below


For convenience, define $M = XX^T-YZZ^TY$ and, instead of using the trace, write the function in terms of the Frobenius product as $\,f=\|M\|^2_F=M:M$. Note that $M$ is symmetric because $Y$ is symmetric.

Find the differential
$$\eqalign{ df &= 2\,M:dM \cr &= -2\,M:Yd(ZZ^T)Y \cr &= -2\,YMY:(dZZ^T+ZdZ^T) \cr &= -2\,YMY:2\,{\rm sym}(dZZ^T) \cr &= -4\,{\rm sym}(YMY) : dZZ^T\cr &= -4\,YMY : dZZ^T\cr &= -4\,YMYZ : dZ\cr &= -4\,Y(XX^T-YZZ^TY)YZ : dZ\cr }$$
Here ${\rm sym}(A) = \tfrac{1}{2}(A+A^T)$, and the ${\rm sym}$ drops out in the sixth line because $XX^T-YZZ^TY$ and $Y$ are both symmetric, so $YMY$ is symmetric as well.

Since $df=\frac{\partial f}{\partial Z} : dZ$, the derivative is seen to be
$$\eqalign{ \frac{\partial f}{\partial Z} &= -4\,Y(XX^T-YZZ^TY)YZ \cr }$$

As a reminder, the basic rules for manipulating a Frobenius product are
$$\eqalign{ A:BC &= AC^T:B \cr A:BC &= B^TA:C \cr A:B &= B:A \cr {\rm sym}(A):B &= A:{\rm sym}(B) \cr {\rm skew}(A):B &= A:{\rm skew}(B) \cr }$$
all of which can be verified using the trace equivalence ${\rm tr}(A^TB) = A:B$.
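If you want to sanity-check the gradient above numerically, one option is to compare it against central finite differences. The following is only a sketch in NumPy, not part of the derivation: the function names `f`, `grad_f`, and `numerical_grad`, the sizes `N = 5`, `k = 3`, and the random test matrices are all assumptions chosen for illustration.

```python
import numpy as np

def f(X, Y, Z):
    """Objective ||X X^T - Y Z Z^T Y||_F^2 (equal to tr(M^T M))."""
    M = X @ X.T - Y @ Z @ Z.T @ Y
    return np.sum(M * M)

def grad_f(X, Y, Z):
    """Closed-form gradient -4 Y M Y Z; assumes Y is symmetric."""
    M = X @ X.T - Y @ Z @ Z.T @ Y
    return -4.0 * Y @ M @ Y @ Z

def numerical_grad(X, Y, Z, eps=1e-6):
    """Central finite differences, perturbing one entry of Z at a time."""
    G = np.zeros_like(Z)
    for i in range(Z.shape[0]):
        for j in range(Z.shape[1]):
            E = np.zeros_like(Z)
            E[i, j] = eps
            G[i, j] = (f(X, Y, Z + E) - f(X, Y, Z - E)) / (2 * eps)
    return G

# Hypothetical test data: random X and Z, and a random symmetric Y.
rng = np.random.default_rng(0)
N, k = 5, 3
X = rng.standard_normal((N, N))
A = rng.standard_normal((N, N))
Y = (A + A.T) / 2
Z = rng.standard_normal((N, k))

G_exact = grad_f(X, Y, Z)
G_num = numerical_grad(X, Y, Z)
rel_err = np.linalg.norm(G_exact - G_num) / np.linalg.norm(G_exact)
print("relative error:", rel_err)
```

A small relative error suggests the closed form and the numerical gradient agree. The formula also shows that any $Z$ with $YZZ^TY = XX^T$ makes the gradient vanish, which, when $Y$ is the identity, is the $ZZ^T = XX^T$ condition anticipated in the question.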