I have a problem in which I need to compute
$$ \frac{\partial}{\partial\boldsymbol{X}}\,\operatorname{tr}\left(\left( \boldsymbol{X X}^\top \right) ^{\frac{1}{2}}\right) $$
Peterson gives a very thorough discussion of different types of matrix differentiation, including ones involving traces of quadratic forms. Nevertheless, I am at a loss when a fractional power is involved. I tried the following:
$$ \begin{align} \dfrac{\partial}{\partial\boldsymbol{X}}\,\operatorname{tr}\left(\left( \boldsymbol{X X} ^\top\right) ^{\frac{1}{2}}\right) &= \left \{ \dfrac{\partial}{\partial\boldsymbol{X}^{1/2}}\,\operatorname{tr}\left(\left( \boldsymbol{X X} ^\top \right)^{1/2}\right) \right\}^\top\dfrac{\partial\boldsymbol{X}^{1/2}}{\partial\boldsymbol{X}} \end{align} $$
However, the chain rule cannot be applied this way, since $\boldsymbol{X}^{1/2}$ may not exist if $\boldsymbol{X}$ is not square.
Thanks in advance.
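The function $$N = {\rm tr}\Big(\sqrt{XX^T}\Big)$$ is the nuclear norm of $X$, i.e. the sum of its singular values (equivalently the nuclear norm of $X^T$, since the two matrices share the same singular values).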
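Writing $A = XX^T$ and using $d\,{\rm tr}\,f(A) = {\rm tr}\big(f'(A)\,dA\big)$ for symmetric $A$, the differential is $$\eqalign{ dN &= {\rm tr}\Big(\tfrac{1}{2}A^{-1/2}\,\big(dX\,X^T + X\,dX^T\big)\Big) \cr &= {\rm tr}\Big(\big(A^{-1/2}X\big)^T dX\Big) \cr }$$ so the gradient is given by whichever of the following forms has an invertible factor: $$\eqalign{ \frac{\partial N}{\partial X} &= (XX^T)^{-1/2}\,X \qquad \text{(if } XX^T \text{ is invertible)} \cr &= X\,(X^TX)^{-1/2} \qquad \text{(if } X^TX \text{ is invertible)} \cr }$$ If the thin SVD of a full-rank $X$ is available, then $$\eqalign{ X &= USV^T \cr \frac{\partial N}{\partial X} &= UV^T \cr }$$ If $X$ is rank-deficient, $N$ is not differentiable there, and $UV^T$ (from the compact SVD) is only a subgradient.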
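As a sanity check, the formulas can be compared numerically against a finite-difference gradient. The following is only a minimal sketch using NumPy; the helper names (`nuclear_norm`, `grad_svd`, `grad_inverse_sqrt`, `grad_finite_difference`) and the random $3\times 5$ test matrix are illustrative assumptions, and the test matrix is assumed to have full row rank so that $XX^T$ is invertible.

```python
import numpy as np

def nuclear_norm(X):
    # Sum of singular values, which equals tr((X X^T)^{1/2}).
    return np.linalg.svd(X, compute_uv=False).sum()

def grad_svd(X):
    # Thin SVD X = U S V^T; the gradient is U V^T when X has full rank.
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ Vt

def grad_inverse_sqrt(X):
    # (X X^T)^{-1/2} X, built from an eigendecomposition of the symmetric
    # matrix X X^T. Assumes X X^T is invertible (X has full row rank).
    w, Q = np.linalg.eigh(X @ X.T)
    return (Q / np.sqrt(w)) @ Q.T @ X

def grad_finite_difference(X, h=1e-6):
    # Central differences, perturbing one entry of X at a time.
    G = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        E = np.zeros_like(X)
        E[idx] = h
        G[idx] = (nuclear_norm(X + E) - nuclear_norm(X - E)) / (2 * h)
    return G

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 5))  # wide matrix; full row rank almost surely

G_svd = grad_svd(X)
G_inv = grad_inverse_sqrt(X)
G_fd = grad_finite_difference(X)

# Largest entrywise discrepancies between the three gradients.
print(np.max(np.abs(G_svd - G_inv)))
print(np.max(np.abs(G_svd - G_fd)))
```

Both printed discrepancies should be small (limited by round-off and by the finite-difference step). For a tall matrix with full column rank, the analogous check would use $X\,(X^TX)^{-1/2}$ instead.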