derivative with respect to a vector/matrix


Please excuse the basic question, but I can't find anything online.

If $$f(\vec{x}) = \vec{x}^TA\vec{x}$$ with $A$ being a matrix, then $$ \frac{df}{d\vec{x}} = \vec{x}^T(A+A^T)$$ Can someone tell me why this is? I am also interested in knowing what the derivatives of the following terms are: $$ \frac{d}{d\vec{x}}\vec{x}^T A, \qquad \frac{d}{d\vec{x}}A \vec{x}$$ as well as the derivatives with respect to a matrix $H$: $$ \frac{d}{dH}H A , \qquad \frac{d}{dH}A H^T$$

Many thanks for your help.


There are 2 best solutions below

On BEST ANSWER

So you have a function $f \colon \mathbb{R}^n \to \mathbb{R}$, given by $f(x) = x^T Ax$. If you know the concept of derivative in $\mathbb{R}^n$, you just compute $$ f(x+h) -f(x) = (x+h)^TA(x+h) -x^TAx= x^TAh + h^TAx + O(|h|^2), $$ where $O(|h|^2)$ is a term bounded by a constant times $|h|^2$ as $h \to 0$ (here it is exactly $h^TAh$). Since $h^TAx$ is a scalar, it equals its own transpose, so $h^TAx = x^TA^Th$. Using this, you conclude that $$ f(x+h) -f(x) = x^TAh + x^TA^Th + O(|h|^2) = x^T(A+A^T)h + O(|h|^2), $$ and therefore $$ \lim_{|h| \to 0} \frac{f(x+h) -f(x) - x^T(A+A^T)h}{|h|}=0, $$ and, by definition, $Df(x)=x^T(A+A^T)$.
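The limit above can be checked numerically. The following is only a sketch, assuming NumPy and an arbitrary random $4 \times 4$ matrix; the names `f`, `Df` and `remainder_ratio` are made up for the illustration. It evaluates the quotient from the limit for shrinking steps $h = t\,d$ and should show it shrinking roughly in proportion to $t$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))   # arbitrary, not symmetric
x = rng.standard_normal(n)

def f(v):
    """Quadratic form f(v) = v^T A v."""
    return v @ A @ v

# Candidate derivative at x: the row vector x^T (A + A^T).
Df = x @ (A + A.T)

def remainder_ratio(t, direction):
    """|f(x+h) - f(x) - Df h| / |h| for h = t * direction."""
    h = t * direction
    return abs(f(x + h) - f(x) - Df @ h) / np.linalg.norm(h)

direction = rng.standard_normal(n)
ratios = [remainder_ratio(t, direction) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)  # each entry should be about 10 times smaller than the previous one
```

The remainder is exactly $h^TAh$, which is quadratic in $t$, so dividing by $|h|$ leaves something linear in $t$; that is why the printed values drop by a factor of about ten each time.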

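For the other functions, first of all check whether they are linear (in $x$, or in $H$), because the derivative of a (continuous) linear function coincides with the function itself. For instance, $x \mapsto Ax$ is linear, so its derivative at every point is the linear map $h \mapsto Ah$, that is, the matrix $A$.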
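As a rough illustration of the linearity remark, here is a sketch assuming NumPy and square random matrices; `numerical_jacobian` is a hypothetical helper written for this example, not a library function. It compares finite-difference Jacobians with the linear maps themselves. Note that $x^TA$ is treated here as the vector $A^Tx$, so its Jacobian comes out as $A^T$; other layout conventions may write this differently.

```python
import numpy as np

def numerical_jacobian(g, x, eps=1e-6):
    """Central-difference Jacobian of g: R^n -> R^m at the point x."""
    n = x.size
    cols = []
    for k in range(n):
        e = np.zeros(n)
        e[k] = eps
        cols.append((g(x + e) - g(x - e)) / (2 * eps))
    return np.stack(cols, axis=1)  # column k holds dg/dx_k

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
x = rng.standard_normal(3)

J_Ax = numerical_jacobian(lambda v: A @ v, x)   # expected: A
J_xTA = numerical_jacobian(lambda v: v @ A, x)  # expected: A^T

# Matrix argument: H -> H A and H -> A H^T are linear in H, so the
# derivative at H applied to a direction K is K A and A K^T respectively.
H = rng.standard_normal((3, 3))
K = rng.standard_normal((3, 3))
diff_HA = (H + K) @ A - H @ A        # equals K @ A
diff_AHt = A @ (H + K).T - A @ H.T   # equals A @ K.T

print(np.allclose(J_Ax, A, atol=1e-6), np.allclose(J_xTA, A.T, atol=1e-6))
print(np.allclose(diff_HA, K @ A), np.allclose(diff_AHt, A @ K.T))
```

For a linear map the increment is exactly the map applied to the direction, with no remainder term, which is what the last two comparisons show.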

On

Well, you have to write things out...

$$f(x) = x^TAx = \sum_{i = 1}^n \sum_{j = 1}^n A_{i,j} x_i x_j$$ so using the classic definition of partial derivatives you get $$\frac{\partial}{\partial x_k}f(x) = \sum_{j = 1}^n A_{k,j}x_j+ \sum_{i = 1}^n A_{i,k} x_i = (Ax)_k+(A^Tx)_k=((A+A^T)x)_k,$$ which shows $$\nabla f(x) = (A+A^T)x.$$ This is the column-vector form (the transpose) of the row vector $x^T(A+A^T)$.
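A small worked case may make the index computation concrete. The matrix below is an arbitrary choice for illustration, not something from the question. Take $$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad f(x) = x^TAx = x_1^2 + 2x_1x_2 + 3x_2x_1 + 4x_2^2 = x_1^2 + 5x_1x_2 + 4x_2^2.$$ Differentiating term by term gives $$\frac{\partial f}{\partial x_1} = 2x_1 + 5x_2, \qquad \frac{\partial f}{\partial x_2} = 5x_1 + 8x_2,$$ which agrees with $$(A+A^T)x = \begin{pmatrix} 2 & 5 \\ 5 & 8 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2x_1 + 5x_2 \\ 5x_1 + 8x_2 \end{pmatrix}.$$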

Note that you can see a matrix $H \in \mathbb{R}^{m \times n}$ as a vector $h \in \mathbb{R}^{nm}$ and then you just "reshape" the function to make the dimensions consistent.
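One possible way to make the "reshape" idea concrete is sketched below, assuming NumPy, column-major stacking of the matrix entries, and arbitrary sizes `m`, `n`, `p`; the helper `vec` is hypothetical. With that stacking, the standard identity $\operatorname{vec}(HA) = (A^T \otimes I_m)\operatorname{vec}(H)$ says that the linear map $H \mapsto HA$ is represented by the matrix $A^T \otimes I_m$, which is then also its Jacobian.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, p = 2, 3, 4
H = rng.standard_normal((m, n))
A = rng.standard_normal((n, p))

def vec(M):
    """Stack the columns of M into one long vector (column-major order)."""
    return M.flatten(order="F")

# Matrix representing H -> H A on the reshaped vectors:
# vec(H A) = (A^T kron I_m) vec(H).
J = np.kron(A.T, np.eye(m))   # shape (m*p, m*n)

print(J.shape)
print(np.allclose(J @ vec(H), vec(H @ A)))
```

If the entries are stacked row by row instead (NumPy's default order), the representing matrix changes accordingly, so the stacking convention has to be fixed before writing the derivative as a single matrix.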