Differentiating a matrix function with respect to a scalar


I would like to take the partial derivative of the following with respect to $\psi$:

$$ \operatorname{trace}\bigl((X^\top X)^{-\psi} P\bigr). $$

Here $X \in \mathbb{R}^{p \times n}$ and $P \in \mathbb{R}^{n \times n}$, where $n,p \geq 0$, and $\psi \in [0,1] \subset \mathbb{R}$ is a scalar.

I have no idea how to start doing this as there are matrices involved…

Thank you!


There is 1 best solution below


What follows is a simple consequence of Achille's post. Let $f:\psi\mapsto \operatorname{trace}\bigl((X^\top X)^{-\psi}P\bigr)$.

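For $(X^\top X)^{-\psi}$ to be defined, the $n\times n$ matrix $X^\top X$ must be invertible, so necessarily $n\leq p$ and $\operatorname{rank}(X)=n$. Since $X^\top X$ is then symmetric positive definite, $X^\top X=Q\operatorname{diag}(\lambda_i)Q^\top$ where $\lambda_i>0$ and $Q$ is orthogonal; one has $(X^\top X)^{-\psi}=Q\operatorname{diag}(\lambda_i^{-\psi})Q^\top$ and $\log(X^\top X)=Q\operatorname{diag}(\log(\lambda_i))Q^\top$.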
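As a sanity check on the spectral definitions above, here is a minimal NumPy sketch. The helper names `spd_power` and `spd_log` are made up for illustration, and the random $X$ is only assumed to have full column rank (which holds almost surely for Gaussian entries).

```python
import numpy as np

def spd_power(A, s):
    """A^s for a symmetric positive definite A, via A = Q diag(lam) Q^T."""
    lam, Q = np.linalg.eigh(A)          # lam > 0 when A is positive definite
    return (Q * lam ** s) @ Q.T         # Q diag(lam^s) Q^T

def spd_log(A):
    """Matrix logarithm of a symmetric positive definite A."""
    lam, Q = np.linalg.eigh(A)
    return (Q * np.log(lam)) @ Q.T      # Q diag(log lam) Q^T

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))         # p = 5, n = 3 (assumed full column rank)
A = X.T @ X                             # n x n, symmetric positive definite

# psi = 1 should reproduce the ordinary inverse.
inv_check = np.allclose(spd_power(A, -1.0), np.linalg.inv(A))
# psi = 1/2 applied twice should also give the inverse.
half = spd_power(A, -0.5)
sqrt_check = np.allclose(half @ half, np.linalg.inv(A))
print(inv_check, sqrt_check)
```

Using `eigh` rather than `eig` relies on the symmetry of $X^\top X$ and returns real eigenvalues with an orthogonal $Q$.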

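Since $Q$ and the $\lambda_i$ do not depend on $\psi$ and $\frac{d}{d\psi}\lambda_i^{-\psi}=-\log(\lambda_i)\lambda_i^{-\psi}$, we get $f'(\psi)=-\operatorname{trace}\bigl(\log(X^\top X)(X^\top X)^{-\psi}P\bigr)=-\operatorname{trace}\bigl(\operatorname{diag}(\log(\lambda_i)\lambda_i^{-\psi})R\bigr)$ where $R=[R_{i,j}]=Q^\top PQ$. Thus $f'(\psi)=-\sum_i \log(\lambda_i)\lambda_i^{-\psi}R_{i,i}.$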
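The closed form for $f'(\psi)$ can be checked numerically against a central finite difference. This is only a sketch: the function names, the sizes $p=6$, $n=4$, the test point $\psi=0.3$ and the step $h$ are arbitrary choices for illustration, and $X$ is assumed to have full column rank.

```python
import numpy as np

def trace_power(psi, X, P):
    """f(psi) = trace((X^T X)^(-psi) P) = sum_i lam_i^(-psi) R_ii."""
    lam, Q = np.linalg.eigh(X.T @ X)     # X^T X = Q diag(lam) Q^T
    R_diag = np.diag(Q.T @ P @ Q)        # only the diagonal of R = Q^T P Q is needed
    return np.sum(lam ** (-psi) * R_diag)

def trace_power_derivative(psi, X, P):
    """f'(psi) = -sum_i log(lam_i) lam_i^(-psi) R_ii."""
    lam, Q = np.linalg.eigh(X.T @ X)
    R_diag = np.diag(Q.T @ P @ Q)
    return -np.sum(np.log(lam) * lam ** (-psi) * R_diag)

rng = np.random.default_rng(0)
p, n = 6, 4
X = rng.standard_normal((p, n))          # assumed full column rank
P = rng.standard_normal((n, n))          # P need not be symmetric

psi, h = 0.3, 1e-6
finite_diff = (trace_power(psi + h, X, P) - trace_power(psi - h, X, P)) / (2 * h)
analytic = trace_power_derivative(psi, X, P)
print(finite_diff, analytic)             # the two values should agree closely
```

Note that only the diagonal of $R$ enters, so $P$ does not have to be symmetric for the formula to hold.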