How to compute the derivative of this matrix equation


Let $\mathbf{A}(c)$ be an $M \times N$ matrix that depends on a scalar variable $c$, and let $\mathbf{d}$ be a constant $M \times 1$ vector. Assuming $\frac{d\mathbf{A}(c)}{dc}$ is known, I need to compute the derivative of $f(c)$, where $$f(c) = \mathbf{A}(c)\Big( \mathbf{A}^T(c)\mathbf{A}(c) \Big)^{-1}\mathbf{A}^T(c)\mathbf{d}$$

Is there an expression for $\frac{df}{dc}$ in terms of $\frac{d\mathbf{A}(c)}{dc}$?


There are 2 best solutions below


From the Matrix Cookbook, the derivative of an inverse is $$\frac{\partial Y^{-1}}{\partial c} = -Y^{-1}\frac{\partial Y}{\partial c}Y^{-1}$$ Applying this with $Y = A^TA$ gives $$\frac{d(A^TA)^{-1}}{dc} = -(A^TA)^{-1}\left(\frac{d(A^TA)}{dc}\right)(A^TA)^{-1}$$ and the product rule on $A^TA$ turns this into: $$\frac{d(A^TA)^{-1}}{dc} = -(A^TA)^{-1}\left(\frac{dA^T}{dc}A + A^T\frac{dA}{dc}\right)(A^TA)^{-1}$$ Now we use the product rule again on the whole expression to get: \begin{align*} f'(c) &= \frac{dA}{dc}(A^TA)^{-1}A^Td + A\frac{d(A^TA)^{-1}}{dc}A^Td + A(A^TA)^{-1}\frac{dA^T}{dc}d + 0 \end{align*} where the last term is zero because $d$ is constant. The third term comes from differentiating the factor $A^T$, so it contains $\frac{dA^T}{dc}$ (an $N \times M$ matrix), which is what makes its product with the $M \times 1$ vector $d$ well defined. Plugging in the derivative of the inverse gives what I believe to be the simplest form:

\begin{align*} f'(c) &= \frac{dA}{dc}(A^TA)^{-1}A^Td - A(A^TA)^{-1}\left(\frac{dA^T}{dc}A + A^T\frac{dA}{dc}\right)(A^TA)^{-1}A^Td + A(A^TA)^{-1}\frac{dA^T}{dc}d \end{align*} Since $A$ depends only on $c$, we can write the derivatives as primes to get something that looks prettier: \begin{align*} f'(c) &= A'(A^TA)^{-1}A^Td - A(A^TA)^{-1}({A'}^TA + A^TA')(A^TA)^{-1}A^Td + A(A^TA)^{-1}{A'}^Td \end{align*}
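A product-rule expansion like this is easy to sanity-check numerically. The sketch below is only an illustration: the particular $A(c) = A_0 + cA_1 + \sin(c)A_2$, the random matrices, the sizes, and the helper names are all made up for the test and do not come from the question. It compares the three-term expression $A'(A^TA)^{-1}A^Td - A(A^TA)^{-1}({A'}^TA + A^TA')(A^TA)^{-1}A^Td + A(A^TA)^{-1}{A'}^Td$ against a central finite difference of $f$.

```python
import numpy as np

# Hypothetical test problem: sizes and matrices are arbitrary choices.
rng = np.random.default_rng(0)
M, N = 6, 3
A0, A1, A2 = (rng.standard_normal((M, N)) for _ in range(3))
d = rng.standard_normal(M)

def A(c):
    # Made-up smooth dependence on the scalar c.
    return A0 + c * A1 + np.sin(c) * A2

def dA(c):
    # Exact derivative of the made-up A(c) above.
    return A1 + np.cos(c) * A2

def f(c):
    # f(c) = A (A^T A)^{-1} A^T d, using a linear solve instead of an explicit inverse.
    Ac = A(c)
    return Ac @ np.linalg.solve(Ac.T @ Ac, Ac.T @ d)

def df_analytic(c):
    # Three-term product-rule expression for f'(c).
    Ac, Ap = A(c), dA(c)
    G_inv = np.linalg.inv(Ac.T @ Ac)
    term1 = Ap @ G_inv @ Ac.T @ d
    term2 = -Ac @ G_inv @ (Ap.T @ Ac + Ac.T @ Ap) @ G_inv @ Ac.T @ d
    term3 = Ac @ G_inv @ Ap.T @ d  # note the transpose on A'
    return term1 + term2 + term3

def df_numeric(c, h=1e-6):
    # Central finite difference approximation of f'(c).
    return (f(c + h) - f(c - h)) / (2 * h)

c0 = 0.7
err = np.max(np.abs(df_analytic(c0) - df_numeric(c0)))
print("max abs difference:", err)
```

If the two agree to within finite-difference accuracy, the expansion is at least consistent for this test case; it is a check, not a proof.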


This is not a complete derivation, but if you take PhysicsKid's answer and define $$\eqalign{ \def\sym#1{\operatorname{sym}\left(#1\right)} &\sym{X} = \tfrac12\left(X+X^T\right) \qquad&\big({\rm symmetric\;part\;of\;}X\big) \\ &A^+ = \left(A^TA\right)^{-1}A^T \qquad&\big({\rm pseudo\;inverse\;of\;}A\big) \\ &\dot A = \frac{dA}{dc} \qquad&\big({\rm dot\;notation\;for\;derivatives}\big) \\ &H = AA^+ = H^T \qquad&\big({\rm Hat\;matrix\;of\;}A\big) \\ &\dot H = 2\,\sym{(I-H)\dot AA^+} \qquad&\big(\ldots{\rm its\;derivative}\big) \\ &f = Hd \qquad&\big({\rm your\;function}\big) \\ }$$ then the desired derivative can be written more concisely as $$\dot f=\dot Hd$$ NB: You can find derivations of the formula for $\dot H$ in some statistics textbooks, such as Harville's "Matrix Algebra From a Statistician's Perspective", but they are not as direct as PhysicsKid's answer.
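For readers who want to see the compact form in action, here is a minimal numerical sketch of $\dot H = 2\operatorname{sym}\big((I-H)\dot A A^+\big)$ and $\dot f = \dot H d$. The linear test family $A(c) = B_0 + cB_1$, the sizes, and the function names are assumptions chosen for illustration; `np.linalg.pinv` is used for $A^+$, which coincides with $(A^TA)^{-1}A^T$ when $A$ has full column rank.

```python
import numpy as np

# Hypothetical test problem: B0, B1 and the sizes are arbitrary.
rng = np.random.default_rng(1)
M, N = 5, 2
B0 = rng.standard_normal((M, N))
B1 = rng.standard_normal((M, N))
d = rng.standard_normal(M)

def A(c):
    # Made-up linear dependence on c, so dA/dc is simply B1.
    return B0 + c * B1

def A_dot(c):
    return B1

def sym(X):
    # Symmetric part of a square matrix.
    return 0.5 * (X + X.T)

def hat(c):
    # Hat matrix H = A A^+.
    Ac = A(c)
    return Ac @ np.linalg.pinv(Ac)

def hat_dot(c):
    # Closed form for dH/dc: 2 * sym((I - H) A_dot A^+).
    Ac = A(c)
    A_plus = np.linalg.pinv(Ac)
    H = Ac @ A_plus
    return 2.0 * sym((np.eye(M) - H) @ A_dot(c) @ A_plus)

c0, h = 0.3, 1e-6
# Central finite difference of H(c) for comparison.
H_dot_fd = (hat(c0 + h) - hat(c0 - h)) / (2 * h)
gap_H = np.max(np.abs(hat_dot(c0) - H_dot_fd))

# Derivative of f = H d at c0.
f_dot = hat_dot(c0) @ d
print("max abs difference in dH/dc:", gap_H)
```

Working with $H$ directly keeps the code short and makes the symmetry of $\dot H$ explicit, at the cost of forming $M \times M$ matrices.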