Differentiation of matrix with respect to vector


I have two row vectors $y$ and $k$ of size $1 \times m$ and $1 \times p$ respectively, and a matrix $X$ of size $p \times m$. What is the derivative of $$(y - kX)^{T} (y - kX)$$ with respect to $k$?

Is it $-2yX^{T} +2k(XX^{T})$?

Also, can anyone suggest good material for understanding matrix differentiation with respect to vectors and matrices?


There are 2 best solutions below

On BEST ANSWER

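Let $w=(kX-y)$ and let a colon denote the trace (Frobenius) product, $A:B={\rm tr}(A^TB)$. Since $w$ is a row vector, $w^Tw$ is an $m\times m$ matrix, so take the scalar squared norm $ww^T={\rm tr}(w^Tw)$ as the objective. Then you can write the function, differential, and gradient as $$\eqalign{ f &= ww^T = {\rm tr}(w^Tw) = w:w \cr df &= 2w:dw = 2w:dk\,X = 2wX^T:dk \cr \frac{\partial f}{\partial k} &= 2wX^T = 2(kX-y)X^T \cr }$$ Expanding gives $-2yX^T + 2k(XX^T)$, so your hypothesis is correct.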
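The gradient above can be checked numerically. The following is a minimal sketch, assuming NumPy and reading $f$ as the scalar squared norm of $w$; the function names, the random test data, and the sizes `m = 5`, `p = 3` are illustrative choices, not part of the question.

```python
import numpy as np

def f(k, X, y):
    """Scalar objective: squared norm of the residual w = kX - y."""
    w = k @ X - y                      # shape (1, m)
    return float((w @ w.T)[0, 0])

def grad_f(k, X, y):
    """Closed-form gradient from the derivation: 2 (kX - y) X^T, shape (1, p)."""
    return 2.0 * (k @ X - y) @ X.T

def numerical_grad(k, X, y, eps=1e-6):
    """Central finite differences, perturbing one entry of k at a time."""
    g = np.zeros_like(k)
    for j in range(k.shape[1]):
        step = np.zeros_like(k)
        step[0, j] = eps
        g[0, j] = (f(k + step, X, y) - f(k - step, X, y)) / (2 * eps)
    return g

# Arbitrary test data (assumed sizes, fixed seed for reproducibility)
rng = np.random.default_rng(0)
m, p = 5, 3
y = rng.standard_normal((1, m))
k = rng.standard_normal((1, p))
X = rng.standard_normal((p, m))

analytic = grad_f(k, X, y)
numeric = numerical_grad(k, X, y)
hypothesis = -2 * y @ X.T + 2 * k @ (X @ X.T)   # the form proposed in the question

max_err = np.max(np.abs(analytic - numeric))
print("max |analytic - numeric| =", max_err)
print("matches the question's formula:", np.allclose(analytic, hypothesis))
```

Because $f$ is quadratic in $k$, central differences should agree with the closed form up to rounding error, so a large discrepancy would point to a sign or transpose mistake.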

On

Set $f(k) = (y - kX)^{T} (y - kX).$ Then, if we drop terms not linear in $dk,$ $$\begin{align} f'(k)(dk) & := f(k+dk) - f(k) \\ &= - y^{T}(dk)X - X^{T}(dk)^{T}y + X^{T}(dk)^{T}kX + X^{T}k^{T}(dk)X \\ \end{align}$$

As you can see, $dk$ occurs in the middle of the terms. That makes it difficult, not to say impossible, to write the derivative without including $dk$ or some other placeholder.
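The linear terms above can be verified numerically for the matrix-valued map. This is a minimal sketch, assuming NumPy; the names `F` and `dF`, the random data, and the sizes are illustrative. It checks that what is left after subtracting the linear part is exactly the second-order term $(dk\,X)^T(dk\,X)$, and that the trace of the linear part equals the directional derivative $2(kX-y)X^T\,dk^T$ of the scalar version.

```python
import numpy as np

def F(k, X, y):
    """Matrix-valued map (y - kX)^T (y - kX), shape (m, m)."""
    r = y - k @ X
    return r.T @ r

def dF(k, dk, X, y):
    """Terms of F(k + dk) - F(k) that are linear in dk."""
    return (-y.T @ dk @ X - X.T @ dk.T @ y
            + X.T @ dk.T @ k @ X + X.T @ k.T @ dk @ X)

# Arbitrary test data (assumed sizes, fixed seed for reproducibility)
rng = np.random.default_rng(1)
m, p = 4, 3
y = rng.standard_normal((1, m))
k = rng.standard_normal((1, p))
X = rng.standard_normal((p, m))
dk = 1e-3 * rng.standard_normal((1, p))    # small perturbation of k

exact_change = F(k + dk, X, y) - F(k, X, y)
linear_part = dF(k, dk, X, y)
remainder = exact_change - linear_part      # what the linearisation drops
second_order = (dk @ X).T @ (dk @ X)        # the only term quadratic in dk

# Taking the trace collapses the matrix differential to a scalar one
trace_of_linear = np.trace(linear_part)
directional = float((2 * (k @ X - y) @ X.T @ dk.T)[0, 0])

print("remainder is purely second order:", np.allclose(remainder, second_order, atol=1e-12))
print("trace matches scalar gradient:", np.isclose(trace_of_linear, directional))
```

The placeholder $dk$ can be eliminated only after taking the trace, which is why the scalar reading of the problem has a gradient of the same shape as $k$ while the matrix-valued reading does not.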