Derivative of $f(X)=\mathrm{tr}(X^2)$


Represent the derivative of the following scalar function with respect to $X\in\Bbb R^{D \times D}$.

How can I get the derivative of $f(X)=\mathrm{tr}(X^2)$?


Write the trace entrywise as $\mathrm{tr}X^2=\sum_{i,j}x_{ij}x_{ji}$. When differentiating this with respect to $x_{k\ell}$, there are two cases:

  • $x_{k\ell}$ is an off-diagonal entry of $X$ ($k\ne\ell$). Two terms of the summation $\sum_{i,j}x_{ij}x_{ji}$ have nonzero derivative: $x_{k\ell}x_{\ell k}$ and $x_{\ell k}x_{k\ell}$. Both have derivative $x_{\ell k}$, so $(\partial/\partial x_{k\ell})\mathrm{tr}X^2$ is $2x_{\ell k}$.
  • $x_{kk}$ is a diagonal entry of $X$. One term of the summation $\sum_{i,j}x_{ij}x_{ji}$ has nonzero derivative: $x_{kk}x_{kk}$. Its derivative is $2x_{kk}$, so $(\partial/\partial x_{kk})\mathrm{tr}X^2=2x_{kk}$.
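
As a small illustration of the two cases, here is a worked sketch for $D=2$. The entry names $a,b,c,d$ are chosen only for this example and do not appear in the question.

$$ X=\begin{pmatrix} a & b \\ c & d\end{pmatrix},\qquad \mathrm{tr}X^2=a^2+2bc+d^2 $$

$$ \frac{\partial}{\partial a}\mathrm{tr}X^2=2a,\qquad \frac{\partial}{\partial b}\mathrm{tr}X^2=2c,\qquad \frac{\partial}{\partial c}\mathrm{tr}X^2=2b,\qquad \frac{\partial}{\partial d}\mathrm{tr}X^2=2d $$

The diagonal entries $a,d$ give twice themselves, while the off-diagonal entry $b=x_{12}$ gives $2c=2x_{21}$ and $c=x_{21}$ gives $2b=2x_{12}$, matching the two bullet points.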

In both cases the $(k,\ell)$ entry of the matrix of partials is $2x_{\ell k}$, so the matrix of partials of $\mathrm{tr}X^2$ is $2X^T$.
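
If you want to sanity-check the result $2X^T$ numerically, a minimal sketch using central finite differences in plain Python might look like the following. The function names, the step size `h`, and the sample matrix are all assumptions made for this illustration, not part of the original problem.

```python
def trace_of_square(X):
    """Return tr(X^2) = sum_{i,j} x_ij * x_ji for a square matrix given as nested lists."""
    n = len(X)
    return sum(X[i][j] * X[j][i] for i in range(n) for j in range(n))


def numerical_gradient(f, X, h=1e-6):
    """Approximate the matrix of partials df/dx_kl by central differences."""
    n = len(X)
    grad = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            # Perturb only the (k, l) entry, up and down by h.
            plus = [row[:] for row in X]
            minus = [row[:] for row in X]
            plus[k][l] += h
            minus[k][l] -= h
            grad[k][l] = (f(plus) - f(minus)) / (2 * h)
    return grad


# Arbitrary non-symmetric sample matrix (an assumption for the demo).
X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]

grad = numerical_gradient(trace_of_square, X)

# Entry (k, l) of 2 X^T is 2 * x_lk.
expected = [[2 * X[l][k] for l in range(3)] for k in range(3)]

max_error = max(abs(grad[k][l] - expected[k][l])
                for k in range(3) for l in range(3))
print(max_error)  # should be tiny (rounding error only)
```

A non-symmetric sample matrix is used on purpose: for a symmetric $X$ the check could not tell $2X^T$ apart from $2X$.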


More generally, you can do these kinds of calculations with Kronecker deltas, defined by

$$ \delta_{ab}=\begin{cases} 1 & a=b \\ 0 & a\ne b\end{cases} $$

The partial derivatives are $\partial x_{ij}/\partial x_{k\ell}=\delta_{(i,j)(k,\ell)}=\delta_{ik}\delta_{j\ell}$. So we could calculate

$$ \frac{\partial}{\partial x_{k\ell}}\mathrm{tr}X^2=\frac{\partial}{\partial x_{k\ell}}\sum_{i,j}x_{ij}x_{ji}=\sum_{i,j}\left(\frac{\partial x_{ij}}{\partial x_{k\ell}}x_{ji}+x_{ij}\frac{\partial x_{ji}}{\partial x_{k\ell}}\right) $$

by the product rule, then simplify with deltas to

$$ \left(\sum_{i,j}\delta_{ik}\delta_{j\ell}x_{ji}\right)+\left(\sum_{i,j}x_{ij}\delta_{jk}\delta_{i\ell}\right). $$

In these sums, the $\delta$s are $0$ except when the indices $(i,j)$ are exactly right. In the first summation, this means when $(i,j)=(k,\ell)$, and in the second summation this means when $(j,i)=(k,\ell)$. So the two sums in question become $x_{\ell k}+x_{\ell k}$, as expected.
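
To see the deltas picking out a single term from each double sum, here is a short plain-Python sketch that evaluates both sums literally and compares the total with $2x_{\ell k}$. The helper names `delta` and `partial_via_deltas` and the sample matrix are hypothetical, introduced only for this illustration.

```python
def delta(a, b):
    """Kronecker delta: 1 if a == b, else 0."""
    return 1 if a == b else 0


def partial_via_deltas(X, k, l):
    """Evaluate both delta-weighted double sums for d tr(X^2) / d x_kl."""
    n = len(X)
    # First sum: delta_ik * delta_jl * x_ji, nonzero only at (i, j) = (k, l).
    first = sum(delta(i, k) * delta(j, l) * X[j][i]
                for i in range(n) for j in range(n))
    # Second sum: x_ij * delta_jk * delta_il, nonzero only at (j, i) = (k, l).
    second = sum(X[i][j] * delta(j, k) * delta(i, l)
                 for i in range(n) for j in range(n))
    return first + second


# Arbitrary integer sample matrix (an assumption for the demo).
X = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]

# Compare against 2 * x_lk for every index pair (k, l).
all_match = all(partial_via_deltas(X, k, l) == 2 * X[l][k]
                for k in range(3) for l in range(3))
print(all_match)
```

Integer entries keep the comparison exact, so the check does not depend on any floating-point tolerance.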

This approach involves more tedious work, but it is more general and may be what you need in other problems where many indices are in play.