I would like to take the partial derivative of the following with respect to $\psi$:
$$ \operatorname{trace}\bigl((X^\top X)^{-\psi} P\bigr). $$
Here $X \in \mathbb{R}^{p \times n}$ and $P \in \mathbb{R}^{n \times n}$, where $n,p \geq 0$, and $\psi \in [0,1] \subset \mathbb{R}$ is a scalar.
I have no idea how to start doing this as there are matrices involved…
Thank you!
What follows is a simple consequence of Achille's post. Let $f:\psi\mapsto \operatorname{trace}((X^TX)^{-\psi}P)$.
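For $(X^TX)^{-\psi}$ to be defined, $X^TX$ must be invertible, so necessarily $n\leq p$ and $\operatorname{rank}(X)=n$. Since $X^TX$ is then symmetric positive definite, $X^TX=Q\operatorname{diag}(\lambda_i)Q^T$ where $\lambda_i>0$ and $Q$ is orthogonal; one has $(X^TX)^{-\psi}=Q\operatorname{diag}(\lambda_i^{-\psi})Q^T$ and $\log(X^TX)=Q\operatorname{diag}(\log(\lambda_i))Q^T$.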
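Finally, since $\frac{d}{d\psi}\lambda_i^{-\psi}=-\log(\lambda_i)\lambda_i^{-\psi}$, we get $f'(\psi)=-\operatorname{trace}(\log(X^TX)(X^TX)^{-\psi}P)=-\operatorname{trace}(\operatorname{diag}(\log(\lambda_i)\lambda_i^{-\psi})R)$ where $R=[R_{i,j}]=Q^TPQ$. Thus $f'(\psi)=-\sum_i \log(\lambda_i)\lambda_i^{-\psi}R_{i,i}.$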
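
If you want to sanity-check the formula numerically, here is a minimal NumPy sketch. The function names `trace_power` and `trace_power_derivative` are my own (hypothetical, not from any library), and the code assumes $X$ has full column rank so that all $\lambda_i>0$.

```python
import numpy as np

def trace_power(X, P, psi):
    """f(psi) = trace((X^T X)^(-psi) P), via the eigendecomposition of X^T X."""
    lam, Q = np.linalg.eigh(X.T @ X)   # X^T X = Q diag(lam) Q^T, lam > 0 assumed
    R_diag = np.diag(Q.T @ P @ Q)      # diagonal entries R_ii of R = Q^T P Q
    return np.sum(lam ** (-psi) * R_diag)

def trace_power_derivative(X, P, psi):
    """f'(psi) = -sum_i log(lam_i) * lam_i^(-psi) * R_ii."""
    lam, Q = np.linalg.eigh(X.T @ X)
    R_diag = np.diag(Q.T @ P @ Q)
    return -np.sum(np.log(lam) * lam ** (-psi) * R_diag)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p, n = 6, 4                        # n <= p, so X has full column rank almost surely
    X = rng.standard_normal((p, n))
    P = rng.standard_normal((n, n))
    psi, h = 0.3, 1e-6
    # Central finite difference as an independent estimate of f'(psi).
    fd = (trace_power(X, P, psi + h) - trace_power(X, P, psi - h)) / (2 * h)
    print("closed form:      ", trace_power_derivative(X, P, psi))
    print("finite difference:", fd)
```

The two printed values should agree to several digits. Note that only the diagonal of $R$ is needed, so $P$ does not have to be symmetric.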