Derivative of a product with respect to a vector


While studying portfolio theory, I ran into a question about the derivative of a function that appears in the book I'm following.

If I have the function,

$$\mathcal{L}(\mathbf{w},\lambda)=\frac{1}{2}\mathbf{w}^T\mathbf{\Sigma}\mathbf{w}+\lambda\mathbf{w}^T\mathbf{\mu}$$

where $\mathbf{w},\mathbf{\mu}\in\mathbb{R}^n$, $\lambda\in\mathbb{R}$, and $\mathbf{\Sigma}$ is an $n\times n$ symmetric matrix with real entries. According to the book, the partial derivative is

$$\frac{\partial\mathcal{L}}{\partial\mathbf{w}}=\mathbf{\Sigma}\mathbf{w}+\lambda\mathbf{\mu}$$

I understand the second term ($\lambda\mathbf{\mu}$), but I don't see why $\frac{\partial}{\partial\mathbf{w}}\left(\frac{1}{2}\mathbf{w}^T\mathbf{\Sigma}\mathbf{w}\right)=\mathbf{\Sigma}\mathbf{w}$. I have searched for derivatives with respect to vectors, but maybe I'm looking in the wrong place, because I haven't found anything useful.


Thanks in advance. I just want to understand why that equality holds and whether there is a formal justification (or a definition that I'm missing).

Best answer

Consider a scalar function of a matrix and two vectors,

$$\phi = a^TMb$$

Since transposition does not change a scalar, we can write this in several equivalent forms:

$$\eqalign{ \phi &= (a^TMb)^T = b^TM^Ta \cr &= (Mb)^Ta = (M^Ta)^Tb \cr }$$

Let's find the gradient with respect to each vector. Start with the differential:

$$\eqalign{ d\phi &= (Mb)^T\,da + (M^Ta)^T\,db \cr }$$

Holding $b$ constant makes $db=0$, and we obtain

$$\eqalign{ d\phi &= (Mb)^T\,da \cr \frac{\partial\phi}{\partial a} &= Mb \cr }$$

Similarly, holding $a$ constant yields

$$\eqalign{ d\phi &= (M^Ta)^T\,db \cr \frac{\partial\phi}{\partial b} &= M^Ta \cr }$$

If $a=b=w$, there is only one vector, but each occurrence still has to be treated independently, so the two contributions add:

$$\eqalign{ \frac{\partial\phi}{\partial w} &= Mw + M^Tw \cr }$$

And if $M^T=M$, this simplifies to

$$\eqalign{ \frac{\partial\phi}{\partial w} &= 2Mw \cr }$$

In your problem $M=\mathbf{\Sigma}$ is symmetric, so the factor $\frac{1}{2}$ cancels the $2$:

$$\eqalign{ \frac{\partial}{\partial w}\left(\frac{1}{2}w^T\Sigma w\right) &= \Sigma w \cr }$$
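If you want to convince yourself numerically, here is a minimal sketch in plain Python that compares the formula $(M+M^T)w$ against central finite differences. The helper names (`quad_form`, `analytic_grad`, `numeric_grad`) and the sample values of `Sigma` and `w` are made up for illustration; they are not from the book.

```python
def quad_form(w, M):
    """Return phi(w) = w^T M w for a matrix M stored as a list of rows."""
    n = len(w)
    return sum(w[i] * M[i][j] * w[j] for i in range(n) for j in range(n))


def analytic_grad(w, M):
    """Gradient of w^T M w from the derivation above: (M + M^T) w."""
    n = len(w)
    return [sum((M[k][j] + M[j][k]) * w[j] for j in range(n)) for k in range(n)]


def numeric_grad(f, w, h=1e-6):
    """Approximate the gradient of f at w by central finite differences."""
    grad = []
    for k in range(len(w)):
        up = list(w)
        down = list(w)
        up[k] += h
        down[k] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


# Hypothetical symmetric "covariance" matrix and weight vector.
Sigma = [[2.0, 0.5, 0.1],
         [0.5, 1.0, 0.3],
         [0.1, 0.3, 1.5]]
w = [0.2, 0.5, 0.3]


def half_quad(v):
    """The first term of the Lagrangian: (1/2) v^T Sigma v."""
    return 0.5 * quad_form(v, Sigma)


# Claimed gradient of the first term: Sigma w.
sigma_w = [sum(Sigma[k][j] * w[j] for j in range(len(w))) for k in range(len(w))]

# Largest componentwise gap between the numeric gradient and Sigma w.
max_err = max(abs(a - b) for a, b in zip(numeric_grad(half_quad, w), sigma_w))
print("max |numeric - Sigma w| =", max_err)
```

The printed gap should be at the level of floating-point rounding. Passing a non-symmetric matrix to `analytic_grad` is a quick way to see that the general result is $Mw + M^Tw$ and that $2Mw$ only appears once $M$ is symmetric.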