Hessian Matrix of Matrix Product


I am not sure how to compute the Hessian matrix of a trace expression such as this:

$$ f(w) := \operatorname{tr} \left( B w w^T A \right) $$

where $A$ and $B$ are $n \times n$ square matrices and $w$ is an $n$-dimensional column vector. I suspect the trace can be eliminated, and I know that the Hessian is the matrix of second-order partial derivatives of $f$, but I am confused about how to proceed.

I calculated the gradient of this function to be $\nabla f(w) = (AB + A^{T}B^{T})w$ but am not sure how to go on from there.


There are 2 best solutions below


Something has gone wrong with your calculation of the gradient. First of all, the gradient is usually defined to be a row vector, so we'll assume you meant the gradient to be $w^TC$ or maybe $w^TC^T$, where $C$ is your matrix $AB+A^TB^T$. But if the gradient of $f$ is $w^TM$ for some matrix $M$, then the Hessian is $M^T$. The Hessian must be symmetric, so $M^T$, and thus $M$, must be symmetric. In general, there is no reason for your matrix $C$ or $C^T$ to be symmetric.
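To make the symmetry argument concrete, here is a minimal NumPy sketch. The $2 \times 2$ matrices `A` and `B` are arbitrary values chosen for illustration and are not from the question. It checks that $C = AB + A^TB^T$ fails to be symmetric for this choice, while the combination $AB + (AB)^T$ is symmetric by construction.

```python
import numpy as np

# Illustrative matrices (assumed values, chosen only for this example).
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 1.0]])

# The matrix from the question's proposed gradient.
C = A @ B + A.T @ B.T

# The symmetrized product AB + (AB)^T = AB + B^T A^T.
H = A @ B + (A @ B).T

c_is_symmetric = np.allclose(C, C.T)
h_is_symmetric = np.allclose(H, H.T)

print("C symmetric:", c_is_symmetric)
print("H symmetric:", h_is_symmetric)
```

A single counterexample like this is enough to show that $C$ cannot be the Hessian for all $A$ and $B$.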


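Note first that, by the cyclic property of the trace, your function can be written simply as $f(\mathbf{w}) = \operatorname{tr} \left( \mathbf{B} \mathbf{w} \mathbf{w}^T \mathbf{A} \right) = \operatorname{tr} \left( \mathbf{w}^T \mathbf{A} \mathbf{B} \mathbf{w} \right) = \mathbf{w}^T \mathbf{AB} \mathbf{w}$, since the trace of a scalar is the scalar itself.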

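The gradient is $\mathbf{g} = 2\, \mathrm{sym}(\mathbf{AB})\, \mathbf{w} = (\mathbf{AB} + \mathbf{B}^T \mathbf{A}^T)\, \mathbf{w}$, where $\mathrm{sym}(\mathbf{X}) = \frac12 (\mathbf{X}+\mathbf{X}^T)$. Note that the second term is $\mathbf{B}^T \mathbf{A}^T = (\mathbf{AB})^T$, not $\mathbf{A}^T \mathbf{B}^T$.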

The Hessian then follows directly: $\mathbf{H} = 2\, \mathrm{sym}(\mathbf{AB}) = \mathbf{AB} + \mathbf{B}^T \mathbf{A}^T$, which is symmetric as required.
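As a sanity check, these formulas can be compared against finite differences. The sketch below uses NumPy with random test matrices. The dimension `n`, the seed, and the step sizes `eps` and `h` are arbitrary assumptions made for illustration. Because $f$ is quadratic, central differences agree with the exact derivatives up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily
n = 4                           # illustrative dimension
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
w = rng.standard_normal(n)

def f(v):
    """f(v) = tr(B v v^T A)."""
    return np.trace(B @ np.outer(v, v) @ A)

S = A @ B
H = S + S.T   # closed-form Hessian, 2 sym(AB)
g = H @ w     # closed-form gradient at w

I = np.eye(n)

# Gradient by central differences along each coordinate direction.
eps = 1e-6
g_fd = np.array([(f(w + eps * I[i]) - f(w - eps * I[i])) / (2 * eps)
                 for i in range(n)])

# Hessian by the four-point central-difference stencil for mixed partials.
h = 1e-3
H_fd = np.array([[(f(w + h * I[i] + h * I[j]) - f(w + h * I[i] - h * I[j])
                   - f(w - h * I[i] + h * I[j]) + f(w - h * I[i] - h * I[j]))
                  / (4 * h * h)
                  for j in range(n)] for i in range(n)])

print("trace form equals quadratic form:", np.isclose(f(w), w @ S @ w))
print("max gradient error:", np.max(np.abs(g_fd - g)))
print("max Hessian error:", np.max(np.abs(H_fd - H)))
```

Both errors should be at the level of floating-point noise. Replacing `H` with `A @ B + A.T @ B.T` in the same script would generally produce a visible mismatch.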