How to compute the derivative with respect to a matrix?

If I have a function $f = v^TAv$, I can compute its derivative w.r.t. $v$ as follows:

$$ \begin{align} f &= \sum_i \sum_j A_{ij}v_iv_j \\ \frac{\partial f}{\partial v_k} &= \sum_j A_{kj}v_j + \sum_i A_{ik}v_i \\ &= A_{k,:}v + v^TA_{:,k} \\ \frac{\partial f}{\partial v} &= (A + A^T)v. \end{align}$$

This is how I learnt to compute the derivative w.r.t. a vector. How can I compute the derivative w.r.t. a matrix?

More specifically, what is $\dfrac{\partial f}{\partial A_{nm}}$? And how do I convert the result to a matrix?

There is 1 best solution below

Differentiating a function at a particular point means finding the linear map, denoted by $Df$, that best approximates the function near that point (see page 16 of Spivak, Calculus on Manifolds, for example). When $f$ is already linear in the variable, $Df=f$.

Now, $f$ is non-linear in $v$, and what you computed is the linear function $D_vf$ that best approximates $f$ near $v$.
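To make the "best linear approximation" concrete, the expansion can be written out directly. Here $h$ is an arbitrary small perturbation of $v$, a symbol introduced only for this illustration:

$$ \begin{align} f(v+h) &= (v+h)^TA(v+h) \\ &= v^TAv + v^TAh + h^TAv + h^TAh \\ &= f(v) + v^T(A+A^T)h + h^TAh. \end{align} $$

The middle term is linear in $h$ and the last term is quadratic in $h$, so at the point $v$ the map $D_vf$ is $h \mapsto v^T(A+A^T)h$.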

But as a function of $A$, $f$ is already linear, since $f(aA+bB)=af(A)+bf(B)$. So the derivative $D_Af$ is $f$ itself: applied to any matrix $H$, it gives $D_Af(H) = v^THv$.
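The linearity in $A$ can be checked numerically. The following is only a minimal sketch in plain Python: the helper names `quad_form` and `combine` and the test matrices are made up for the illustration and are not part of the question.

```python
# Hypothetical helper: f(A, v) = v^T A v, with A a list of rows and v a list.
def quad_form(A, v):
    n = len(v)
    return sum(A[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


# Hypothetical helper: the entrywise combination a*A + b*B.
def combine(a, A, b, B):
    return [[a * x + b * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]


# Arbitrary illustrative data (assumed, not from the question).
v = [1.0, -2.0, 3.0]
A = [[2.0, 0.0, 1.0], [4.0, -1.0, 0.5], [0.0, 3.0, 1.0]]
B = [[1.0, 1.0, 0.0], [0.0, 2.0, -3.0], [5.0, 0.0, 1.0]]
a, b = 0.5, -2.0

# Linearity: f(aA + bB) should equal a f(A) + b f(B).
lhs = quad_form(combine(a, A, b, B), v)
rhs = a * quad_form(A, v) + b * quad_form(B, v)
linear_gap = abs(lhs - rhs)

# Because f is linear in A, the increment f(A + H) - f(A) is f(H) itself,
# with no higher-order remainder. Here B plays the role of the direction H.
H = B
increment = quad_form(combine(1.0, A, 1.0, H), v) - quad_form(A, v)
increment_gap = abs(increment - quad_form(H, v))

print(linear_gap, increment_gap)
```

Both gaps should be zero up to floating-point rounding, which is the numerical counterpart of the statement that the derivative with respect to $A$ is $f$ itself.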

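The array of partial derivatives $\frac{\partial f}{\partial A}$ is a different thing from $D_Af$, but it can be read off from the sum $f = \sum_n \sum_m A_{nm}v_nv_m$: treating the entries of $A$ as independent, $\frac{\partial f}{\partial A_{nm}} = v_nv_m$, which in matrix form is $\frac{\partial f}{\partial A} = vv^T$.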

Note: the way you have written $\frac{\partial f}{\partial v}$ is not strictly correct. It should be $v^T(A+A^T)$. That is, it should be a row vector if $v$ is a column vector.
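The components of the derivative with respect to $v$ can be sanity-checked with central differences. This is a rough sketch in plain Python; the helper names, the step size `eps`, and the test data are assumptions made for the illustration. A flat Python list carries no row or column orientation, so the check only compares components.

```python
# Hypothetical helper: f(A, v) = v^T A v, with A a list of rows and v a list.
def quad_form(A, v):
    n = len(v)
    return sum(A[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


# Components of (A + A^T) v, the closed form derived in the question.
def closed_form(A, v):
    n = len(v)
    return [sum((A[k][j] + A[j][k]) * v[j] for j in range(n))
            for k in range(n)]


# Central-difference estimate of each partial derivative df/dv_k.
def central_difference(A, v, eps=1e-3):
    grads = []
    for k in range(len(v)):
        up = list(v)
        up[k] += eps
        down = list(v)
        down[k] -= eps
        grads.append((quad_form(A, up) - quad_form(A, down)) / (2 * eps))
    return grads


# Arbitrary illustrative data (assumed, not from the question).
A = [[2.0, 0.0, 1.0], [4.0, -1.0, 0.5], [0.0, 3.0, 1.0]]
v = [1.0, -2.0, 3.0]

exact = closed_form(A, v)
approx = central_difference(A, v)
max_gap = max(abs(e - g) for e, g in zip(exact, approx))

print(exact, approx, max_gap)
```

Since $f$ is quadratic in $v$, the central difference has no truncation error, so the two lists should agree up to rounding.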