Prove that the gradient of $\operatorname{tr}(A \cdot B \cdot A^{T} \cdot C)$ with respect to $A$ is $C \cdot A \cdot B + C^{T} \cdot A \cdot B^{T}$


I've been poking through SEE's Machine learning notes and am having difficulty proving the relationship:

The gradient of the trace $\operatorname{tr}(A \cdot B \cdot A^{T} \cdot C)$ with respect to $A$ is $\nabla_A \operatorname{tr}(A \cdot B \cdot A^{T} \cdot C) = C \cdot A \cdot B + C^{T} \cdot A \cdot B^{T}$.

I tried writing the gradient of the trace as a bunch of nested sums but am having difficulty doing it with 4 matrices. I was wondering if there is an easier approach to proving the relationship.

Thanks! S.

Best answer

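Write $X'$ for the transpose of $X$. Note that $\text{tr}(AB) = \sum_i \sum_j A_{ij} B_{ji}$, so $\dfrac{\partial}{\partial A_{ij}} \text{tr}(AB) = B_{ji}$, i.e. $\text{grad}_A \text{tr}(AB) = B'$. Thus, for $D$ held fixed, $\text{grad}_A \text{tr}(ABDC) = (BDC)' = C'D'B'$, while $\text{grad}_A \text{tr}(DBA'C) = \text{grad}_A \text{tr}(CDBA') = \text{grad}_A \text{tr}(AB'D'C') = (B'D'C')' = CDB$, using the cyclic property of the trace and $\text{tr}(X) = \text{tr}(X')$. By the product rule, the gradient of $\text{tr}(ABA'C)$ is the sum of the gradients with respect to each occurrence of $A$, taken with the other occurrence held fixed. So (substituting $A'$ for $D$ in the first and $A$ for $D$ in the second), $\text{grad}_A \text{tr}(ABA'C) = C'AB' + CAB$.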
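If you want a quick sanity check of the identity, here is a minimal NumPy sketch that compares the closed form with a central finite-difference gradient. The function names (`trace_form`, `claimed_gradient`, `numerical_gradient`), the matrix sizes, and the random seed are my own choices for illustration, not anything from the notes. A numerical match is only evidence, not a proof.

```python
import numpy as np

def trace_form(A, B, C):
    """f(A) = tr(A B A^T C)."""
    return np.trace(A @ B @ A.T @ C)

def claimed_gradient(A, B, C):
    """Closed form from the answer: C A B + C^T A B^T."""
    return C @ A @ B + C.T @ A @ B.T

def numerical_gradient(A, B, C, eps=1e-6):
    """Central finite differences, perturbing one entry of A at a time."""
    grad = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            step = np.zeros_like(A)
            step[i, j] = eps
            f_plus = trace_form(A + step, B, C)
            f_minus = trace_form(A - step, B, C)
            grad[i, j] = (f_plus - f_minus) / (2 * eps)
    return grad

# Assumed shapes: A is m x n, B is n x n, C is m x m.
# B and C are deliberately non-symmetric so the transposes matter.
rng = np.random.default_rng(0)
m, n = 3, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((m, m))

max_err = np.max(np.abs(numerical_gradient(A, B, C) - claimed_gradient(A, B, C)))
print("max abs difference:", max_err)
```

Since $\text{tr}(ABA'C)$ is quadratic in $A$, the central difference has no truncation error, so the printed difference should be at the level of floating-point rounding.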