Suppose we have a matrix $X(a)\in\mathbb{R}^{N\times K}$ where $N>K$ and $X^TX$ is non-singular. The parameter $a$ is a scalar, $a\in\mathbb{R}$. A vector $y\in\mathbb{R}^N$ is not a function of $a$. What is:
$$ \frac{d}{da}X(a)\left[X(a)^TX(a)\right]^{-1}X(a)^Ty $$
Other info:
- It would be most useful for me if the answer is expressed in terms of $X(a)'$, the element-wise derivative of $X$ (along with $X(a)$ and $y$)
- Highly related to derivative of a projection matrix. Could be I'm just not experienced at mixing calculus and matrices, but I'm not seeing the solution here as directly applicable to my problem.
- Not particularly interested in edge cases. You may assume everything is well-conditioned, the derivative exists, etc.
Write $a$-derivatives with primes. Since $a$ is a scalar, the ordinary product rule applies to matrix products, as long as the order of the factors is preserved. First note how to differentiate a matrix inverse. With $N=M^{-1}$:
$$MN=I\implies O=(MN)^\prime=MN^\prime+M^\prime N\implies N^\prime=-M^{-1}M^\prime N,$$
i.e. $(M^{-1})^\prime=-M^{-1}M^\prime M^{-1}$. So
$$(X^TX)^{-1\prime}=-(X^TX)^{-1}(X^TX)^\prime(X^TX)^{-1}=-(X^TX)^{-1}(X^{\prime T}X+X^TX^\prime)(X^TX)^{-1}.$$
Hence
$$\begin{aligned}(X(X^TX)^{-1}X^Ty)^\prime&=X^\prime(X^TX)^{-1}X^Ty\\&-X(X^TX)^{-1}(X^{\prime T}X+X^TX^\prime)(X^TX)^{-1}X^Ty\\&+X(X^TX)^{-1}X^{\prime T}y\\&+X(X^TX)^{-1}X^Ty^\prime.\end{aligned}$$
Edit: as @Matterhorn notes, the problem as stated satisfies $y^\prime=0$, allowing us to delete the last term.
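A formula like this is easy to get wrong by a sign or a transpose, so a numerical sanity check may be useful. The sketch below is only an illustration: the parametrization $X(a)=B+\sin(a)\,C$, the names `X_of_a`, `dX_of_a`, `fitted`, `fitted_derivative`, and the random test data are all assumptions of mine, not part of the question. It compares the three surviving terms (with $y^\prime=0$) against a central finite difference, and also against a regrouped form using $P=X(X^TX)^{-1}X^T$, which should be algebraically equivalent.

```python
import numpy as np

# Hypothetical smooth parametrization, chosen only for testing:
# X(a) = B + sin(a) * C, so the element-wise derivative is X'(a) = cos(a) * C.
def X_of_a(a, B, C):
    return B + np.sin(a) * C

def dX_of_a(a, B, C):
    return np.cos(a) * C

def fitted(a, B, C, y):
    """Return X (X^T X)^{-1} X^T y evaluated at a."""
    X = X_of_a(a, B, C)
    return X @ np.linalg.solve(X.T @ X, X.T @ y)

def fitted_derivative(a, B, C, y):
    """Analytic derivative from the answer, with y' = 0."""
    X = X_of_a(a, B, C)
    dX = dX_of_a(a, B, C)
    G = X.T @ X
    beta = np.linalg.solve(G, X.T @ y)  # (X^T X)^{-1} X^T y
    term1 = dX @ beta
    term2 = -X @ np.linalg.solve(G, (dX.T @ X + X.T @ dX) @ beta)
    term3 = X @ np.linalg.solve(G, dX.T @ y)
    return term1 + term2 + term3

rng = np.random.default_rng(0)
N, K = 8, 3
B = rng.normal(size=(N, K))
C = rng.normal(size=(N, K))
y = rng.normal(size=N)
a, h = 0.3, 1e-6

# Central finite difference of the fitted vector with respect to a.
numeric = (fitted(a + h, B, C, y) - fitted(a - h, B, C, y)) / (2 * h)
analytic = fitted_derivative(a, B, C, y)

# Regrouped form: (I - P) X' beta + X (X^T X)^{-1} X'^T (I - P) y.
X, dX = X_of_a(a, B, C), dX_of_a(a, B, C)
G = X.T @ X
beta = np.linalg.solve(G, X.T @ y)
residual = y - X @ beta  # (I - P) y
compact = (dX @ beta - X @ np.linalg.solve(G, X.T @ (dX @ beta))
           + X @ np.linalg.solve(G, dX.T @ residual))

print(np.max(np.abs(numeric - analytic)))
print(np.max(np.abs(compact - analytic)))
```

Both printed discrepancies should be small; the first is limited by the finite-difference step, the second only by rounding. Using `np.linalg.solve` rather than forming $(X^TX)^{-1}$ explicitly is a design choice for numerical stability, not something the derivation requires.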