Derivative of squared norm of component of a matrix perpendicular to identity matrix, with respect to the original matrix


Let $J\in\mathbb{R}^{n\times n}$.

What is the derivative (with respect to $J$) of the squared norm of the component of $J$ that is orthogonal to $I$ (the identity matrix)?

Attempt

$J$'s projection onto $I$ is $\frac{\langle J,I\rangle_F}{n}I=\frac{Tr(J)}{n}I$

where $\langle A,B\rangle_F=Tr(A^TB)$ denotes the Frobenius inner product (dot product for matrices) and $Tr(A)$ denotes the trace of $A$.

So the orthogonal component is $J-\frac{Tr(J)}{n}I$

So we seek

$$\frac{\partial}{\partial J}\left\|J-\frac{Tr(J)}{n}I\right\|_F^2$$ $$=\frac{\partial}{\partial J}Tr\left(\left(J-\frac{Tr(J)}{n}I\right)^T\left(J-\frac{Tr(J)}{n}I\right)\right)$$ $$=\frac{\partial}{\partial J}Tr\left(J^TJ-\frac{Tr(J)}{n}(J^T+J)+\frac{Tr^2(J)}{n^2}I\right)$$

How to proceed (if correct so far)?


There are 3 best solutions below

3
On BEST ANSWER

This question is (relatively) easily answered using the chain rule for the total derivative. Let $f(X) = \|X\|_F^2$, and let $g(X) = X - \frac{\operatorname{tr}(X)}{n}I$. We note that $g$ is linear, so its derivative is given by $dg(X)(H) = g(H)$. On the other hand, we have $$ f(X + H) = \operatorname{tr}[(X + H)^T(X + H)] \\ = \operatorname{tr}(X^TX) + 2\operatorname{tr}(X^TH) + \operatorname{tr}(H^TH)\\ = f(X) + 2\operatorname{tr}(X^TH) + o(\|H\|_F), $$ since $\operatorname{tr}(H^TH) = \|H\|_F^2$. Conclude that $df(X)(H) = 2\operatorname{tr}(X^TH)$.

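With the chain rule, we have $$ d[f \circ g](X)(H) = [df(g(X)) \circ dg(X)](H) = df(g(X))(g(H)) \\ = 2\operatorname{tr}(g(X)^Tg(H)) = 2\operatorname{tr}\left(g(X)^T\left[H - \frac{\operatorname{tr}(H)}{n}I\right]\right)\\ = 2\operatorname{tr}\left(g(X)^TH\right) - \frac 2n \operatorname{tr}(g(X))\operatorname{tr}\left(H\right) \\ = 2\operatorname{tr}\left(X^TH\right) - \frac 2n \operatorname{tr}(X)\operatorname{tr}\left(H\right), $$ where the last step uses $\operatorname{tr}(g(X)) = 0$ and $\operatorname{tr}(g(X)^TH) = \operatorname{tr}(X^TH) - \frac 1n \operatorname{tr}(X)\operatorname{tr}(H)$. To convert this to the more conventional format of "denominator layout", we use the identification $dh(J)(H) = \operatorname{tr}\left(\left(\frac{dh}{dJ}\right)^TH\right)$ to find that $h(J) = (f \circ g)(J)$ satisfies $$ \frac{dh}{dJ} = 2J - \frac 2n \operatorname{tr}(J)I = 2g(J). $$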
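As a cross-check, the trace expansion started in the question can also be finished directly. The following is only a sketch; it assumes the standard identities $\frac{\partial}{\partial J}Tr(J^TJ) = 2J$ and $\frac{\partial}{\partial J}Tr(J) = I$, and uses $Tr(J^T) = Tr(J)$ and $Tr(I) = n$: $$ Tr\left(J^TJ-\frac{Tr(J)}{n}(J^T+J)+\frac{Tr^2(J)}{n^2}I\right) = Tr(J^TJ) - \frac{2\,Tr^2(J)}{n} + \frac{Tr^2(J)}{n} \\ = Tr(J^TJ) - \frac{Tr^2(J)}{n}, $$ so that $$ \frac{\partial}{\partial J}\left[Tr(J^TJ) - \frac{Tr^2(J)}{n}\right] = 2J - \frac 2n Tr(J)I, $$ which agrees with the result obtained from the chain rule.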

0
On

In continuum mechanics, they have a name for this,
it's called the Isotropic-Deviatoric decomposition. $$\eqalign{ {\rm iso}(A) &= \left[\frac{{\rm Tr}(A)}{{\rm Tr}(I)}\right]I, \qquad {\rm dev}(A) = A - {\rm iso}(A) \\ }$$ The operations are idempotent and orthogonal $$\eqalign{ {\rm iso}({\rm iso}(A)) &= {\rm iso}(A) \\ {\rm iso}({\rm dev}(A)) &= {\rm dev}({\rm iso}(A)) \;=\; 0 \\ {\rm dev}({\rm dev}(A)) &= {\rm dev}(A) \\ }$$ and behave like the Sym-Skew operators with respect to the inner product $$\eqalign{ A:B &= {\rm Tr}\big(A^TB\big) &\{\rm Frobenius\,product\}\\ 0 &={\rm iso}(A):{\rm dev}(B) \\ A:{\rm iso}(B) &= {\rm iso}(A)\,:{\rm iso}(B) &= {\rm iso}(A):B \\ A:{\rm dev}(B) &= {\rm dev}(A):{\rm dev}(B) &= {\rm dev}(A):B \\ }$$ Write the current problem in terms of these operators.
Then calculate the differential and gradient. $$\eqalign{ X &= {\rm dev}(J) \\ \phi &= X:X \\ d\phi &= 2X:dX \\ &= 2X:{\rm dev}(dJ) \\ &= 2\,{\rm dev}(X):dJ \\ &= 2X:dJ \\ \frac{\partial\phi}{\partial J} &= 2X = 2\,{\rm dev}(J) \\ }$$
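The identities and the gradient above can be sanity-checked numerically. The snippet below is a minimal sketch in plain Python (no external libraries); the helper names `trace`, `iso`, `dev`, `frob`, `phi`, `numerical_gradient` and `max_abs_diff` are made up for this illustration, and the test matrices are random. It compares $2\,{\rm dev}(J)$ with a central finite-difference gradient of $\phi$.

```python
import random


def trace(A):
    """Sum of the diagonal entries of a square matrix (list of lists)."""
    return sum(A[i][i] for i in range(len(A)))


def iso(A):
    """Isotropic part: (Tr(A) / Tr(I)) * I."""
    n = len(A)
    t = trace(A) / n
    return [[t if i == j else 0.0 for j in range(n)] for i in range(n)]


def dev(A):
    """Deviatoric part: A - iso(A)."""
    n = len(A)
    t = trace(A) / n
    return [[A[i][j] - (t if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def frob(A, B):
    """Frobenius product A:B = Tr(A^T B), i.e. the entrywise dot product."""
    n = len(A)
    return sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))


def phi(J):
    """Squared Frobenius norm of the deviatoric part of J."""
    X = dev(J)
    return frob(X, X)


def numerical_gradient(f, J, eps=1e-6):
    """Central finite differences of f with respect to each entry of J."""
    n = len(J)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            Jp = [row[:] for row in J]
            Jm = [row[:] for row in J]
            Jp[i][j] += eps
            Jm[i][j] -= eps
            G[i][j] = (f(Jp) - f(Jm)) / (2 * eps)
    return G


def max_abs_diff(A, B):
    """Largest entrywise absolute difference between two matrices."""
    n = len(A)
    return max(abs(A[i][j] - B[i][j]) for i in range(n) for j in range(n))


random.seed(0)  # arbitrary seed, only for reproducibility
n = 4
J = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
B = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

analytic = [[2 * x for x in row] for row in dev(J)]  # claimed gradient 2 dev(J)
numeric = numerical_gradient(phi, J)

gradient_error = max_abs_diff(analytic, numeric)
orthogonality = frob(iso(J), dev(B))                # should vanish
idempotence_error = max_abs_diff(dev(dev(J)), dev(J))

print("gradient error:", gradient_error)
print("iso(J):dev(B):", orthogonality)
print("dev idempotence error:", idempotence_error)
```

Because $\phi$ is quadratic in $J$, the central difference has no truncation error, so the gradient error printed here should be at the level of floating-point rounding.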

0
On

With $S(X) = \|X\|^2 = \langle X, X \rangle$ we have $DS(X) H = 2 \langle X, H\rangle$.

Since $\phi(J)= J - {\operatorname{tr} J \over n} I$ is linear we see that $D \phi(J)H = \phi(H)$.

The chain rule gives $D (S\circ \phi) (J)H = D S(\phi(J)) D \phi(J)H = 2 \langle \phi(J), \phi(H)\rangle$.

Unwinding (& rewinding) gives $D (S\circ \phi) (J)H = \langle 2J - 2{\operatorname{tr} J \over n} I , H \rangle = \langle 2\phi(J), H\rangle$.
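For completeness, the unwinding step can be spelled out as follows (a sketch using only the definitions above, with $\langle A, B \rangle = \operatorname{tr}(A^TB)$): $$ \langle \phi(J), \phi(H)\rangle = \langle \phi(J), H\rangle - {\operatorname{tr} H \over n} \langle \phi(J), I\rangle = \langle \phi(J), H\rangle - {\operatorname{tr} H \over n} \operatorname{tr} \phi(J) = \langle \phi(J), H\rangle, $$ since $\operatorname{tr} \phi(J) = \operatorname{tr} J - {\operatorname{tr} J \over n} \operatorname{tr} I = 0$. Hence the gradient with respect to the Frobenius inner product is $2\phi(J) = 2J - 2{\operatorname{tr} J \over n} I$.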