Derivative of squared Frobenius norm of a matrix with transpose

How can I differentiate the squared Frobenius norm $||\mathbf A^{T}\mathbf A - \mathbf I||^2_F$ with respect to $\mathbf A$, where $\mathbf A$ is a $D \times K$ matrix? I tried it myself and got $4(\mathbf A^T \mathbf A - \mathbf I)\mathbf A$. This result is obviously wrong, because the dimensions do not match.

Here are my steps: $$\frac{\partial ||\mathbf A^{T}\mathbf A - \mathbf I||^2_F}{\partial \mathbf A} = \frac{\partial \operatorname{trace}\{(\mathbf A^{T}\mathbf A - \mathbf I)^T(\mathbf A^{T}\mathbf A - \mathbf I)\}}{\partial \mathbf A}=\frac{\partial \operatorname{trace}\{(\mathbf A^{T}\mathbf A - \mathbf I)^2\}}{\partial \mathbf A}=\frac{\partial \operatorname{trace}\{(\mathbf A^{T}\mathbf A - \mathbf I)^2\}}{\partial (\mathbf A^{T}\mathbf A - \mathbf I)}\frac{\partial (\mathbf A^{T}\mathbf A - \mathbf I)}{\partial \mathbf A}=2(\mathbf A^{T}\mathbf A - \mathbf I)2\mathbf A = 4(\mathbf A^T \mathbf A - \mathbf I)\mathbf A$$

There is 1 best solution below

Define the symmetric matrix $$X = A^TA - I$$ Write the norm in terms of this new variable, then calculate its differential and gradient. $$\eqalign{ \phi &= \|X\|_F^2 \;=\; X:X \\ d\phi &= X:dX + dX:X \\ &= 2X:dX \\ &= 2X:d(A^TA) \\ &= 2X:(dA^TA + A^TdA) \\ &= 2AX^T:dA + 2AX:dA \\ &= 2A(X^T+X):dA \\ &= 4AX:dA \\ \frac{\partial \phi}{\partial A} &= 4AX \;=\; 4A(A^TA-I) \\ }$$ The last steps use the symmetry $X^T = X$. The gradient is $D \times K$, the same shape as $A$; the attempt in the question has the same factors in the wrong order, which is why its dimensions do not match. In the above, a colon is used as a convenient product notation for the trace, i.e. $$A:B = {\rm Tr}(A^TB)$$ The properties of the trace allow the terms in the product to be rearranged in several ways. $$\eqalign{ A:B &= B:A \\ A:B &= A^T:B^T \\ A:BC &= B^TA:C \;=\; AC^T:B \;=\; \ldots \\ }$$
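As a sanity check, the closed form $4A(A^TA-I)$ can be compared against a finite-difference gradient. The following is only a minimal sketch using NumPy; the function names (`phi`, `grad_phi`, `numerical_grad`), the sizes $D=5$, $K=3$, the random seed, and the step size are all assumptions chosen for illustration and are not part of the original answer.

```python
import numpy as np

def phi(A):
    """Squared Frobenius norm of A^T A - I."""
    K = A.shape[1]
    X = A.T @ A - np.eye(K)
    return np.sum(X * X)

def grad_phi(A):
    """Closed-form gradient 4 A (A^T A - I); has the same shape as A."""
    K = A.shape[1]
    return 4.0 * A @ (A.T @ A - np.eye(K))

def numerical_grad(f, A, eps=1e-6):
    """Central finite differences, perturbing one entry of A at a time."""
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = eps
            G[i, j] = (f(A + E) - f(A - E)) / (2.0 * eps)
    return G

rng = np.random.default_rng(0)
D, K = 5, 3  # arbitrary illustrative sizes (assumption)
A = rng.standard_normal((D, K))

G_exact = grad_phi(A)
G_num = numerical_grad(phi, A)

print(G_exact.shape)  # (5, 3), i.e. D x K like A
print(np.allclose(G_exact, G_num, rtol=1e-5, atol=1e-5))

# A matrix with orthonormal columns satisfies A^T A = I,
# so the gradient should vanish there (up to rounding).
Q, _ = np.linalg.qr(A)
print(np.abs(grad_phi(Q)).max())
```

If the two gradients agree to within the tolerance, that supports the factor ordering $4AX$ rather than $4XA$; the latter cannot even be formed when $D \neq K$.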