Derivative of a matrix product with respect to a scalar


How do I take the derivative of $$f\left(\lambda\right)=\mathbf{X}\left(\mathbf{X}^{T}\mathbf{X}+\lambda\mathbf{I}\right)^{-1}\mathbf{X}^{T}\mathbf{y}$$ with respect to the scalar variable $\lambda$, where $\mathbf{X}$ is a constant matrix and $\mathbf{y}$ is a constant vector?

I have tried some matrix calculus identities, but could not find a solution.

Best answer

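Let $A:=X^TX+\lambda I$, so $\partial_\lambda A=I$. Differentiating $AA^{-1}=I$ with the product rule gives $$O=(\partial_\lambda A)A^{-1}+A\,\partial_\lambda A^{-1}=A^{-1}+A\,\partial_\lambda A^{-1}\implies\partial_\lambda A^{-1}=-A^{-2}.$$ Since $X$ and $y$ do not depend on $\lambda$, it follows that $\partial_\lambda f=-X(X^TX+\lambda I)^{-2}X^Ty$.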
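
The closed form can be sanity-checked numerically. The sketch below is not part of the original answer: it assumes NumPy, and the random test data, $\lambda=0.7$, and step size $h$ are arbitrary illustrative choices. It compares $-X A^{-2} X^T y$ against a central finite difference of $f$.

```python
import numpy as np

def f(X, y, lam):
    """f(lambda) = X (X^T X + lambda I)^{-1} X^T y."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return X @ np.linalg.solve(A, X.T @ y)

def df_dlam(X, y, lam):
    """Closed form from the answer: -X (X^T X + lambda I)^{-2} X^T y."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    # Apply A^{-1} twice via linear solves instead of forming A^{-2}.
    v = np.linalg.solve(A, np.linalg.solve(A, X.T @ y))
    return -X @ v

# Arbitrary test data (assumption, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))
y = rng.standard_normal(8)
lam = 0.7
h = 1e-6

# Central finite-difference approximation of the derivative.
fd = (f(X, y, lam + h) - f(X, y, lam - h)) / (2 * h)
analytic = df_dlam(X, y, lam)
max_abs_error = np.max(np.abs(fd - analytic))
print(max_abs_error)
```

If the formula is right, the printed discrepancy should be tiny compared with the entries of the derivative itself. Using two solves rather than an explicit inverse is just a numerical-stability preference, not something the derivation requires.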