Represent the derivative of the following scalar functions with respect to $X \in \Bbb R^{D \times D}$
How can I get the derivative of $f(X)=\mathrm{tr}(X^2)$?
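When differentiating $\mathrm{tr}X^2=\sum_{i,j}x_{ij}x_{ji}$ with respect to $x_{k\ell}$, there are two cases. If $k=\ell$, the only term containing $x_{kk}$ is $x_{kk}^2$, whose derivative is $2x_{kk}$. If $k\ne\ell$, the variable $x_{k\ell}$ appears in exactly two terms, $x_{k\ell}x_{\ell k}$ and $x_{\ell k}x_{k\ell}$ (from $(i,j)=(k,\ell)$ and $(i,j)=(\ell,k)$), whose derivatives sum to $2x_{\ell k}$.
In both cases the partial with respect to $x_{k\ell}$ is $2x_{\ell k}$, so the matrix of partials of $\mathrm{tr}X^2$ is in fact $2X^T$.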
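If you want to convince yourself numerically, here is a minimal sketch that compares a central-difference estimate of each partial with the entries of $2X^T$. The helper names (`trace_of_square`, `numerical_partials`), the step size `h`, and the random $3\times 3$ test matrix are all my own choices for illustration, not part of the question.

```python
import random

def trace_of_square(X):
    """Return tr(X^2) = sum_{i,j} x_ij * x_ji for a square list-of-lists X."""
    n = len(X)
    return sum(X[i][j] * X[j][i] for i in range(n) for j in range(n))

def numerical_partials(f, X, h=1e-6):
    """Central-difference estimate of the matrix of partials d f / d x_kl."""
    n = len(X)
    G = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            Xp = [row[:] for row in X]  # copy with x_kl nudged up
            Xm = [row[:] for row in X]  # copy with x_kl nudged down
            Xp[k][l] += h
            Xm[k][l] -= h
            G[k][l] = (f(Xp) - f(Xm)) / (2 * h)
    return G

random.seed(0)  # arbitrary seed, only for reproducibility
D = 3
X = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(D)]

G = numerical_partials(trace_of_square, X)
# Entry (k, l) of 2 X^T is 2 * x_lk.
expected = [[2 * X[l][k] for l in range(D)] for k in range(D)]

max_error = max(abs(G[k][l] - expected[k][l])
                for k in range(D) for l in range(D))
print("largest deviation from 2 X^T:", max_error)
```

Because $\mathrm{tr}X^2$ is quadratic in the entries, the central difference has no truncation error here, so the deviation should be at the level of floating-point rounding.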
More generally you can do these kinds of things with Kronecker deltas, defined broadly by
$$ \delta_{ab}=\begin{cases} 1 & a=b \\ 0 & a\ne b\end{cases} $$
The partial derivatives are $\partial x_{ij}/\partial x_{k\ell}=\delta_{(i,j)(k,\ell)}=\delta_{ik}\delta_{j\ell}$. So we could calculate
$$ \frac{\partial}{\partial x_{k\ell}}\mathrm{tr}X^2=\frac{\partial}{\partial x_{k\ell}}\sum_{i,j}x_{ij}x_{ji}=\sum_{i,j}\left(\frac{\partial x_{ij}}{\partial x_{k\ell}}x_{ji}+x_{ij}\frac{\partial x_{ji}}{\partial x_{k\ell}}\right) $$
by the product rule, then simplify with deltas to
$$ \left(\sum_{i,j}\delta_{ik}\delta_{j\ell}x_{ji}\right)+\left(\sum_{i,j}x_{ij}\delta_{jk}\delta_{i\ell}\right). $$
In these sums, the $\delta$s are $0$ except when the indices $(i,j)$ are exactly right. In the first summation, this means when $(i,j)=(k,\ell)$, and in the second summation this means when $(j,i)=(k,\ell)$. So the two sums in question become $x_{\ell k}+x_{\ell k}=2x_{\ell k}$, which is the $(k,\ell)$ entry of $2X^T$, as expected.
This approach involves more tedious work, but it is more general and may be what you have to do when there are lots of indices in play in other problems.
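The delta bookkeeping can also be carried out literally by a program, which may help if the index manipulation feels abstract. The sketch below evaluates the two delta sums term by term for a small example; the function names `delta` and `partial_via_deltas` and the $2\times 2$ matrix are hypothetical choices of mine.

```python
def delta(a, b):
    """Kronecker delta: 1 if a == b, else 0."""
    return 1 if a == b else 0

def partial_via_deltas(X, k, l):
    """Evaluate d tr(X^2) / d x_kl using the two delta sums from the product rule."""
    n = len(X)
    # First sum: delta_ik * delta_jl * x_ji, nonzero only at (i, j) = (k, l).
    first = sum(delta(i, k) * delta(j, l) * X[j][i]
                for i in range(n) for j in range(n))
    # Second sum: x_ij * delta_jk * delta_il, nonzero only at (j, i) = (k, l).
    second = sum(X[i][j] * delta(j, k) * delta(i, l)
                 for i in range(n) for j in range(n))
    return first + second

X = [[1, 2],
     [3, 4]]
partials = [[partial_via_deltas(X, k, l) for l in range(2)] for k in range(2)]
print(partials)  # [[2, 6], [4, 8]], i.e. 2 X^T
```

Each sum collapses to the single entry $x_{\ell k}$, so the result matches $2X^T=\begin{pmatrix}2&6\\4&8\end{pmatrix}$ for this $X$.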