I have a mapping $u:\mathbb{R}^{n\times n}\to \mathbb{R}$ of the form $$u(\mathbf{X}) = \langle \mathbf{X}, \mathbf{X}^T\mathbf{C}\rangle = \mathrm{tr}(\mathbf{C}^T\mathbf{X}^2),$$ where $\mathbf{C} \in \mathbb{R}^{n\times n}$ is non-symmetric and the angular brackets denote the matrix dot product, i.e. the sum of entries in an element-wise (or Hadamard) product. Its gradient is $$\mathbf{\nabla} u(\mathbf{X}) = \frac{\partial u}{\partial \mathbf{X}} = \mathbf{X}^T\mathbf{C} + \mathbf{C}\mathbf{X}^T,$$ and if my math is correct, the Hessian can be represented as $$\mathbf{H} = \frac{\partial \mathbf{\nabla}u}{\partial \mathbf{X}} = \mathbf{C}\otimes \mathbf{I} + \mathbf{I}\otimes \mathbf{C}^T$$ since $\frac{\partial(\mathbf{X}\mathbf{C}+\mathbf{C}\mathbf{X})}{\partial \mathbf{X}} = \mathbf{C}^T\otimes \mathbf{I} + \mathbf{I}\otimes \mathbf{C}$ and $\frac{\partial \mathbf{f}}{\partial \mathbf{X}^T} = \left(\frac{\partial \mathbf{f}}{\partial \mathbf{X}}\right)^T$. However, $$\mathbf{H}^T = \mathbf{C}^T\otimes \mathbf{I} + \mathbf{I}\otimes \mathbf{C} \neq \mathbf{H}.$$ What am I missing?
2026-03-29 08:13:46.1774772026
Constant Hessian is not symmetric?
100 Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail At
There are 2 best solutions below
Rewrite your function (here $\phi = u$, and the colon denotes the same matrix dot product as your angular brackets) as $$\phi = \mathbf{X}:\mathbf{X}^T\mathbf{C} = \mathbf{x}:(\mathbf{C}^T\otimes \mathbf{I}) \operatorname{vec} \left( \mathbf{X}^T \right) = \mathbf{x}:(\mathbf{C}^T\otimes \mathbf{I}) \mathbf{K}\mathbf{x}, $$ where $\mathbf{K}$ is the commutation matrix and $\mathbf{x} =\operatorname{vec} \left( \mathbf{X}\right)$.
This is a quadratic form in $\mathbf{x}$, so the Hessian follows directly: $\mathbf{H}=\mathbf{A}+\mathbf{A}^T$ with $\mathbf{A} =(\mathbf{C}^T\otimes \mathbf{I}) \mathbf{K}$, which is clearly symmetric.
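The quadratic-form argument above can be checked numerically. The following is only a minimal sketch, assuming NumPy, a column-stacking `vec`, and a randomly drawn (hence non-symmetric) $\mathbf{C}$; the helper names `vec` and `commutation_matrix` are made up for this illustration and are not part of the original answer.

```python
import numpy as np

def vec(M):
    """Column-stacking vectorization, so that vec(A X B) = kron(B.T, A) @ vec(X)."""
    return M.reshape(-1, order="F")

def commutation_matrix(n):
    """Return K with K @ vec(X) == vec(X.T) for every n-by-n matrix X."""
    K = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            # X[i, j] sits at index i + j*n in vec(X) and at j + i*n in vec(X.T).
            K[j + i * n, i + j * n] = 1.0
    return K

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice
n = 4
C = rng.standard_normal((n, n))  # generic matrix, so C != C.T
X = rng.standard_normal((n, n))
I = np.eye(n)
K = commutation_matrix(n)

A = np.kron(C.T, I) @ K          # A = (C^T kron I) K
H = A + A.T                      # Hessian of the quadratic form x^T A x
x = vec(X)

# u(X) = tr(C^T X^2) should equal the quadratic form x^T A x.
quad_form_ok = np.isclose(np.trace(C.T @ X @ X), x @ A @ x)

# H should be symmetric and agree with (C^T kron I + I kron C) K.
hessian_symmetric = np.allclose(H, H.T)
matches_second_form = np.allclose(H, (np.kron(C.T, I) + np.kron(I, C)) @ K)

# The expression from the question, which lacks the factor K.
naive = np.kron(C, I) + np.kron(I, C.T)
naive_symmetric = np.allclose(naive, naive.T)

print(quad_form_ok, hessian_symmetric, matches_second_form, naive_symmetric)
```

The last flag compares against the expression proposed in the question: without the commutation matrix, the Kronecker sum is symmetric only when $\mathbf{C}$ itself is.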
Your calculations are a bit inaccurate. Since $u(X)=\operatorname{tr}(C^TX^2)$, for a small perturbation $P$, the first-order term of $u(X+P)$ in $P$ is $\operatorname{tr}(C^TXP+C^TPX)=\operatorname{tr}((C^TX+XC^T)P)=\langle\operatorname{vec}(P),\operatorname{vec}(X^TC+CX^T)\rangle$. Therefore $$ \nabla u(X)=\operatorname{vec}(X^TC+CX^T)=(C^T\otimes I+I\otimes C)K\operatorname{vec}(X)\in\mathbb R^{n^2}, $$ where $K$ denotes the commutation matrix such that $\operatorname{vec}(X^T)=K\operatorname{vec}(X)$ for every $X\in\mathbb R^{n\times n}$. (By the way, the gradient of a real-valued function is a coordinate vector, not a square matrix.)
Since $v\mapsto (C^T\otimes I+I\otimes C)Kv$ is linear, the Hessian of $u$ is just $$ H=(C^T\otimes I+I\otimes C)K. $$ This matrix is symmetric: as $K=K^T$, it is straightforward to verify that $$ H\operatorname{vec}(X) =\operatorname{vec}(X^TC+CX^T) =H^T\operatorname{vec}(X) $$ for every matrix $X$.
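One way to spell out the symmetry check, as a sketch that relies on two standard properties of the square commutation matrix not stated above, namely $K^T=K=K^{-1}$ and $K(A\otimes B)K=B\otimes A$ for $A,B\in\mathbb R^{n\times n}$ (so that $K(A\otimes B)=(B\otimes A)K$):

$$ H^T = K^T\left(C^T\otimes I+I\otimes C\right)^T = K\left(C\otimes I+I\otimes C^T\right) = \left(I\otimes C+C^T\otimes I\right)K = H. $$

The transpose alone turns $C^T\otimes I+I\otimes C$ into $C\otimes I+I\otimes C^T$, which is the apparent asymmetry in the question; moving $K$ from the left to the right swaps the Kronecker factors and restores the original expression.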