I have to calculate the gradient of:
$F(x) = \frac{1}{2} x^T * H * x$
where $H$ is a constant symmetric $n \times n$ matrix and $x$ is an $n \times 1$ vector. My question: do I have to multiply out the expression to get a scalar, or do I need to use a chain rule? If I multiply it out (I did it with a simpler $3 \times 3$ matrix), the result is $\nabla F(x) = H * x$, which should be correct.
I need to get used to these vector/matrix derivatives, but I can't always multiply things out on simpler examples. So: is there a better way to do such things? :) Thanks for your help.
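You have two ways to tackle the problem: first, write $F$ out in coordinates and differentiate the resulting scalar expression with respect to each $x_i$; second, use the chain rule with suitable maps.
Personally, I prefer the second technique because (1) it avoids the confusion of juggling many indices and (2) it can be applied in any vector space.
I will continue with the second technique from here.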
The goal is to use the chain rule with the appropriate maps. Consider the bilinear map $$\begin{array}{l|rcl} B : & V \times V & \longrightarrow & \mathbb R \\ & (u,v) & \longmapsto & u^T * H *v \end{array}$$where $V = \mathbb R^n$. The derivative of $B$ at the point $(u,v)$ is the linear map $$B^\prime(u,v).(h,k)=h^T * H * v + u^T * H * k.$$
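As a quick sanity check of the derivative formula for the bilinear map $(u,v) \mapsto u^T * H * v$, here is a minimal NumPy sketch. The matrix `H`, the vectors, the step `t` and the helper names `B` and `dB` are illustrative assumptions, not part of the question. Because the map is bilinear, the remainder after subtracting the linear part should be exactly $t^2 \, h^T * H * k$, which is second order in the increment.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Hypothetical test data: a random symmetric H, as assumed in the question.
A = rng.standard_normal((n, n))
H = (A + A.T) / 2


def B(u, v):
    """Bilinear map (u, v) -> u^T H v."""
    return u @ H @ v


def dB(u, v, h, k):
    """Claimed derivative of B at (u, v), applied to the increment (h, k)."""
    return h @ H @ v + u @ H @ k


u, v, h, k = rng.standard_normal((4, n))
t = 1e-3

# Remainder after removing the linear part of the increment.
remainder = B(u + t * h, v + t * k) - B(u, v) - dB(u, v, t * h, t * k)

# For a bilinear map this remainder equals B(t*h, t*k) = t^2 * B(h, k).
print(remainder, t**2 * B(h, k))
```

The two printed numbers should agree up to floating-point rounding, which is what makes $B^\prime(u,v)$ the derivative: the error vanishes faster than the size of the increment.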
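Now the important point is to notice that $$F(x)=\frac{1}{2}B(x,x).$$ Hence, applying the chain rule to $x \mapsto (x,x) \mapsto B(x,x)$, you get $$F^\prime(x).h=\frac{1}{2}h^T * H *x + \frac{1}{2}x^T * H *h.$$ Since $h^T * H * x$ is a scalar, it equals its own transpose $x^T * H^T * h$, and $H^T = H$ because $H$ is symmetric. The two terms are therefore equal and $$\color{red}{F^\prime(x).h=x^T*H* h}.$$ Writing this as $F^\prime(x).h=(H*x)^T*h$, the gradient is $\nabla F(x)=H*x$.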
From there, you can retrieve the partial derivatives. Denote by $(e_1, \dots, e_n)$ the canonical basis of $\mathbb R^n$. You have $$\color{red}{\frac{\partial F}{\partial x_i}(x)=F^\prime(x).e_i=\nabla F(x)^T*e_i=x^T*(H*e_i)=\sum_{j=1}^n x_j H_{ji}}$$
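The result can also be checked numerically. The sketch below is only an illustration under assumed test data (a random symmetric `H` and a random `x`; the helper names `F`, `grad_F` and `numerical_grad` are made up for this example). It compares $H * x$ with central finite differences of $F$ along each basis vector $e_i$, and with the component formula $\sum_j x_j H_{ji}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5

# Hypothetical test data: a random symmetric H, as assumed in the question.
A = rng.standard_normal((n, n))
H = (A + A.T) / 2


def F(x):
    """Quadratic form F(x) = 1/2 x^T H x."""
    return 0.5 * x @ H @ x


def grad_F(x):
    """Gradient derived above: H x (valid because H is symmetric)."""
    return H @ x


def numerical_grad(f, x, eps=1e-6):
    """Central finite differences of f along each canonical basis vector."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = 1.0
        g[i] = (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    return g


x = rng.standard_normal(n)

# Largest componentwise gap between the formula and the finite differences.
max_err = np.max(np.abs(numerical_grad(F, x) - grad_F(x)))

# Component formula: the i-th partial derivative is sum_j x_j H_ji.
partials = np.array([sum(x[j] * H[j, i] for j in range(n)) for i in range(n)])

print(max_err)
print(np.allclose(partials, grad_F(x)))
```

If `H` were not symmetric, the same check would match $\frac{1}{2}(H + H^T) * x$ instead, which is a handy way to notice a forgotten symmetry assumption.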