Gradient and Hessian of vector multiplication

I was asked to find the gradient ($\nabla f(x)$) and the Hessian ($H(f(x))$) of $f(x)=(a^T x)\cdot (b^T x)$, where $x$, $a$, and $b$ are $n$-dimensional column vectors.

I was not taught how to find them with respect to vectors of general form.

Here is what I've done so far using intuition from calculus and linear algebra, but I think there are some issues.

$f(x)=(a^T\cdot x)\cdot (b^T\cdot x)=\sum_{i=1}^{n}a_{i} x_{i}\cdot \sum_{i=1}^{n}b_{i} x_{i}$.

$\nabla f(x) = \sum_{i=1}^{n}a_{i}\cdot \sum_{i=1}^{n}b_{i} x_{i} + \sum_{i=1}^{n}b_{i}\cdot \sum_{i=1}^{n}a_{i} x_{i}$, using the product rule.

$H=\frac{\mathrm{d}\,\nabla f(x)}{\mathrm{d}x} = 2\cdot \sum_{i=1}^{n}a_{i}\cdot \sum_{i=1}^{n}b_{i}$.

Could you tell me whether I'm wrong, and give some intuition for how to convert these back into compact vector form? Any book or reference link with examples like this would be appreciated.

Best answer

Let's use a colon as a convenient product notation for the trace, i.e.
$$A:B = {\rm Tr}(A^TB) = {\rm Tr}(B^TA) = B:A$$
and define the matrix
$$M=ab^T$$
Write the function in terms of this matrix. Then calculate its gradient.
$$\eqalign{
f &= {\rm Tr}(a^Tx\;x^Tb) \\
&= {\rm Tr}(ba^Txx^T) \\
&= M:xx^T \\
df &= M:(dx\,x^T+x\,dx^T) \\
&= (M+M^T):dx\,x^T \\
&= (M+M^T)\,x:dx \\
\frac{\partial f}{\partial x} &= (M+M^T)\,x \;=\; g \qquad\big({\rm gradient\,vector}\big) \\
}$$
Now calculate the gradient of the gradient.
$$\eqalign{
dg &= (M+M^T)\,dx \\
\frac{\partial g}{\partial x} &= (M+M^T) \;=\; H\qquad\big({\rm Hessian\,matrix}\big) \\
}$$
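Substituting $M=ab^T$ into the result above gives $g = a\,(b^Tx) + b\,(a^Tx)$ and $H = ab^T + ba^T$, so the gradient is an $n$-vector and the Hessian an $n\times n$ matrix rather than scalars. As a rough sanity check, the sketch below compares these closed forms against central finite differences. It is only an illustration: the sample vectors, the step size `h`, and all helper names are assumptions made here, not part of the question or the answer.

```python
# Pure-Python check of g = a (b^T x) + b (a^T x) and H = a b^T + b a^T.
# The vectors a, b, x and the step size h are arbitrary illustrative choices.

def dot(u, v):
    """Inner product u^T v of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def f(x, a, b):
    """f(x) = (a^T x) (b^T x)."""
    return dot(a, x) * dot(b, x)

def gradient(x, a, b):
    """Closed form (M + M^T) x with M = a b^T."""
    ax, bx = dot(a, x), dot(b, x)
    return [ai * bx + bi * ax for ai, bi in zip(a, b)]

def hessian(a, b):
    """Closed form M + M^T = a b^T + b a^T (does not depend on x)."""
    n = len(a)
    return [[a[i] * b[j] + b[i] * a[j] for j in range(n)] for i in range(n)]

def numerical_gradient(x, a, b, h=1e-6):
    """Central-difference approximation of each partial derivative of f."""
    grad = []
    for k in range(len(x)):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        grad.append((f(xp, a, b) - f(xm, a, b)) / (2 * h))
    return grad

def numerical_hessian(x, a, b, h=1e-6):
    """Central differences of the closed-form gradient, one column per x_l."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for l in range(n):
        xp, xm = list(x), list(x)
        xp[l] += h
        xm[l] -= h
        gp, gm = gradient(xp, a, b), gradient(xm, a, b)
        for k in range(n):
            H[k][l] = (gp[k] - gm[k]) / (2 * h)
    return H

a = [1.0, 2.0, 3.0]
b = [-1.0, 0.5, 2.0]
x = [0.5, -1.0, 2.0]

g_exact = gradient(x, a, b)
g_num = numerical_gradient(x, a, b)
H_exact = hessian(a, b)
H_num = numerical_hessian(x, a, b)

# Largest absolute disagreement between closed form and finite differences.
grad_err = max(abs(ge - gn) for ge, gn in zip(g_exact, g_num))
hess_err = max(
    abs(H_exact[i][j] - H_num[i][j]) for i in range(len(a)) for j in range(len(a))
)

print("gradient:", g_exact, "max error:", grad_err)
print("Hessian:", H_exact, "max error:", hess_err)
```

Because $f$ is quadratic in $x$, both errors should be limited to floating-point rounding; a noticeably larger discrepancy would point to a mistake in the closed forms.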