Let $X$ be a square matrix.
We know that $\frac {\partial tr(X^TX)}{\partial X}$ is $2X$.
But what about $\frac {\partial tr((X^TX)^2)}{\partial X}$, or even $\frac {\partial tr((X^TX)^p)}{\partial X}$?
Is there any generalization?
Note that here $(X^TX)^2 = X^TXX^TX$, and $(X^TX)^p$ is defined similarly as the $p$-fold matrix product.
When $p=2$, expand to first order in $\Delta X$:
\begin{align*} &\left[(X+\Delta X)^\top(X+\Delta X)\right]^2 - (X^\top X)^2\\ =&\left[(X+\color{red}{\Delta X})^\top(X+\color{green}{\Delta X})(X+\color{blue}{\Delta X})^\top(X+\color{orange}{\Delta X})\right] - (X^\top X)^2\\ =&\color{red}{\Delta X}^\top X(X^\top X) +X^\top \color{green}{\Delta X} (X^\top X) +(X^\top X)\color{blue}{\Delta X}^\top X +(X^\top X)X^\top \color{orange}{\Delta X}+O(\|\Delta X\|^2). \end{align*}
Therefore, using the properties $\newcommand{\tr}{\operatorname{tr}}\tr(AB)=\tr(BA)$ and $\tr(A^\top)=\tr(A)$, each of the four first-order terms has the same trace, and we get $$ \tr\left\{\left[(X+\Delta X)^\top(X+\Delta X)\right]^2 - (X^\top X)^2\right\} = 4\tr\left(\Delta X^\top X(X^\top X)\right) +O(\|\Delta X\|^2). $$
The first-order term has the form $\tr(\Delta X^\top G)$ with $G = 4X(X^\top X)$, and hence $\dfrac{\partial \tr\left((X^\top X)^2\right)}{\partial X} = 4 X(X^\top X)$.

By a similar argument (the expansion now has $2p$ first-order terms, each with trace $\tr\left(\Delta X^\top X(X^\top X)^{p-1}\right)$), one can deduce that $\dfrac{\partial \tr\left((X^\top X)^p\right)}{\partial X} = 2p X(X^\top X)^{p-1}$.
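If you want to sanity-check the formula numerically, here is a minimal sketch that compares $2pX(X^\top X)^{p-1}$ with a central finite-difference gradient. It assumes NumPy is available. The function names (`f`, `grad_closed_form`, `grad_finite_diff`), the step size `h`, and the tolerances are my own choices for illustration, not part of the derivation.

```python
import numpy as np

def f(X, p):
    """Return tr((X^T X)^p)."""
    return np.trace(np.linalg.matrix_power(X.T @ X, p))

def grad_closed_form(X, p):
    """Closed-form gradient 2p X (X^T X)^(p-1) from the derivation above."""
    return 2 * p * X @ np.linalg.matrix_power(X.T @ X, p - 1)

def grad_finite_diff(X, p, h=1e-6):
    """Entrywise central-difference approximation of d f / d X."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h  # perturb only entry (i, j)
            G[i, j] = (f(X + E, p) - f(X - E, p)) / (2 * h)
    return G

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))  # arbitrary test matrix

for p in (1, 2, 3):
    exact = grad_closed_form(X, p)
    approx = grad_finite_diff(X, p)
    # Tolerances are loose enough to absorb finite-difference round-off.
    assert np.allclose(exact, approx, rtol=1e-5, atol=1e-4)
    print(p, np.max(np.abs(exact - approx)))
```

For $p=1$ this reduces to the familiar $2X$, and the printed maximum deviations should be small compared with the size of the gradient entries. A check like this does not prove the formula, but it catches mistakes such as a wrong factor or a misplaced transpose.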