The derivative and extremum of a matrix function


$$f(W)=(Ax-b)^TW(Ax-b)=x^TA^TWAx-2b^TWAx+b^TWb$$

where $f(W)$ is a function of the matrix $W$, $A$ is a known matrix, and $x$ and $b$ are vectors ($b$ is known). Note that the expanded form on the right assumes $W$ is symmetric. How do I compute $\frac{\partial f}{\partial W}$?


There are 2 best solutions below

2
On BEST ANSWER

Define the vector $$y=Ax-b$$ and write the function in terms of this new variable and the double-dot (aka Frobenius) product, $P:Q=\operatorname{tr}(P^TQ)$.

In this form, the differential and gradient are easy to calculate, since $y$ does not depend on $W$: $$\eqalign{ f &= yy^T:W \cr df &= yy^T:dW \cr \frac{\partial f}{\partial W} &= yy^T \cr }$$
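The result can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy and randomly generated $A$, $x$, $b$, $W$ (the sizes, the seed, and the helper names `f`, `grad_f` and `numerical_grad` are illustrative, not part of the question). It compares $yy^T$ against a central finite-difference estimate taken entry by entry, treating all entries of $W$ as independent.

```python
import numpy as np

def f(W, A, x, b):
    # f(W) = (Ax - b)^T W (Ax - b)
    y = A @ x - b
    return y @ W @ y

def grad_f(A, x, b):
    # Analytic gradient from the answer: y y^T (independent of W)
    y = A @ x - b
    return np.outer(y, y)

def numerical_grad(W, A, x, b, eps=1e-6):
    # Central difference on each entry W[i, j], one at a time
    G = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            E = np.zeros_like(W)
            E[i, j] = eps
            G[i, j] = (f(W + E, A, x, b) - f(W - E, A, x, b)) / (2 * eps)
    return G

# Hypothetical test data: sizes and seed are arbitrary choices
rng = np.random.default_rng(0)
m, n = 4, 3
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)
b = rng.standard_normal(m)
W = rng.standard_normal((m, m))

analytic = grad_f(A, x, b)
numeric = numerical_grad(W, A, x, b)
max_err = np.max(np.abs(analytic - numeric))
print("max abs difference:", max_err)
```

Because $f$ is linear in $W$, the finite-difference estimate should agree with $yy^T$ up to floating-point rounding, whatever value of $W$ is used.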

0
On

$$f (\mathrm W) := (\mathrm A \mathrm x - \mathrm b)^{\top} \mathrm W (\mathrm A \mathrm x - \mathrm b) = \mbox{tr} \left( (\mathrm A \mathrm x - \mathrm b)(\mathrm A \mathrm x - \mathrm b)^{\top} \mathrm W \right) = \langle (\mathrm A \mathrm x - \mathrm b)(\mathrm A \mathrm x - \mathrm b)^{\top}, \mathrm W \rangle$$

Hence,

$$f ' (\mathrm W) = \color{blue}{(\mathrm A \mathrm x - \mathrm b)(\mathrm A \mathrm x - \mathrm b)^{\top}}$$
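As a cross-check, the same gradient can be read off entry by entry. This is a sketch under the assumption that all entries of $\mathrm W$ are independent (no symmetry constraint is imposed), with $\mathrm y := \mathrm A \mathrm x - \mathrm b$ used as shorthand:

$$f (\mathrm W) = \sum_{i,j} y_i \, W_{ij} \, y_j \quad \Longrightarrow \quad \frac{\partial f}{\partial W_{ij}} = y_i \, y_j = \left[ (\mathrm A \mathrm x - \mathrm b)(\mathrm A \mathrm x - \mathrm b)^{\top} \right]_{ij}$$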