Hessian of Frobenius norm


I want to find the Hessian of the following function, $$ F(\mathbf{X}) = \frac{1}{2}\Vert \mathbf{Y} - \mathbf{AX} \Vert_{\text{F}}^2 $$


My attempt: using the trace formula for the Frobenius norm, $F(\mathbf{X})$ can be written as $$ F(\mathbf{X}) = \frac{1}{2}\operatorname{tr}[(\mathbf{Y} - \mathbf{AX})(\mathbf{Y} - \mathbf{AX})^T] = \frac{1}{2}\left(\operatorname{tr}(\mathbf{YY}^T) - \operatorname{tr}(\mathbf{YX}^T\mathbf{A}^T) - \operatorname{tr}(\mathbf{AXY}^T) + \operatorname{tr}(\mathbf{AXX}^T\mathbf{A}^T) \right) $$

So, $$ \nabla F(\mathbf{X}) = \frac{1}{2}\left( 0 - \mathbf{A}^T\mathbf{Y} - \mathbf{A}^T\mathbf{Y} + \mathbf{A}^T\mathbf{AX} + \mathbf{A}^T\mathbf{AX}\right) = - \mathbf{A}^T\mathbf{Y} + \mathbf{A}^T\mathbf{AX} $$ I used formulas 101, 102 and 109 of the Matrix Cookbook. Now, $$ \nabla^2 F(\mathbf{X}) = 0 + \frac{d\mathbf{A}^T\mathbf{AX}}{d\mathbf{X}} = \mathbf{A}^T\mathbf{A} $$ I'm not sure whether this last step is valid, that is, whether the derivative of a matrix with respect to a matrix can be written this way. Please help me clear up this confusion.
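One way to make the last step precise is to work with differentials and then vectorize. The following is only a sketch: the shapes $\mathbf{A} \in \mathbb{R}^{m \times n}$, $\mathbf{X} \in \mathbb{R}^{n \times p}$, $\mathbf{Y} \in \mathbb{R}^{m \times p}$ are assumptions (the question does not state them), and $\operatorname{vec}$ is assumed to stack columns. Differentiating the gradient gives $$ d\big(\nabla F(\mathbf{X})\big) = \mathbf{A}^T\mathbf{A}\, d\mathbf{X}, \qquad \operatorname{vec}\big(\mathbf{A}^T\mathbf{A}\, d\mathbf{X}\big) = (\mathbf{I}_p \otimes \mathbf{A}^T\mathbf{A})\operatorname{vec}(d\mathbf{X}), $$ using the identity $\operatorname{vec}(\mathbf{PQR}) = (\mathbf{R}^T \otimes \mathbf{P})\operatorname{vec}(\mathbf{Q})$. Under these assumptions the Hessian with respect to $\operatorname{vec}(\mathbf{X})$ is $$ \frac{\partial^2 F}{\partial \operatorname{vec}(\mathbf{X})\, \partial \operatorname{vec}(\mathbf{X})^T} = \mathbf{I}_p \otimes \mathbf{A}^T\mathbf{A}, $$ an $np \times np$ block-diagonal matrix with $\mathbf{A}^T\mathbf{A}$ repeated once per column of $\mathbf{X}$. Writing $\nabla^2 F = \mathbf{A}^T\mathbf{A}$ is then shorthand for the linear map $d\mathbf{X} \mapsto \mathbf{A}^T\mathbf{A}\, d\mathbf{X}$.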

A related question was asked in Derivative of a Matrix w.r.t. a Matrix.


There is 1 best solution below


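We look for the linear and the bilinear terms in the multivariate Taylor expansion at $x$ (written as a subscript): $$f(x+h) \approx f_x + Df_x\,(h) + \tfrac12 \operatorname{Hess} f_x\,(h,h).$$ Let $f(x)=\|Ax\|^2$; then $$ \langle A(x+h),A(x+h)\rangle=\|Ax\|^2+2\langle Ax,Ah\rangle+\langle Ah,Ah\rangle. $$ The linear term is $2\langle Ax,Ah\rangle=\langle 2A^TAx,h\rangle$, so the gradient is $2A^TAx$. The quadratic term can be written as $\langle Ah,Ah\rangle=\tfrac12 \langle h,2A^TAh\rangle$, thus the Hessian is $H=2A^TA$, a constant matrix. The same computation works for both the Euclidean ($L^2$) and Frobenius norms, since each is induced by an inner product. For the $F$ in the question, the factor $\tfrac12$ cancels the $2$ and the terms involving $\mathbf{Y}$ are at most linear in $\mathbf{X}$, so the Hessian acts on a direction $h$ as $h \mapsto A^TAh$, in agreement with the $\mathbf{A}^T\mathbf{A}$ found in the question.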
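As a numerical sanity check of the expansion above, here is a minimal sketch applied to the $F$ from the question, where the factor $\tfrac12$ turns $2A^TA$ into $A^TA$. Because $F$ is quadratic, the second-order expansion should hold exactly, up to rounding. The matrix sizes, the random test data, and all helper names (`matmul`, `inner`, `grad_F`, and so on) are assumptions made for this illustration; plain Python lists are used so the snippet has no dependencies.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    return [list(col) for col in zip(*P)]

def add(P, Q, s=1.0):
    """Return P + s*Q entrywise."""
    return [[p + s * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def inner(P, Q):
    """Frobenius inner product <P, Q> = tr(P^T Q)."""
    return sum(p * q for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

def F(A, X, Y):
    R = add(Y, matmul(A, X), -1.0)  # residual Y - AX
    return 0.5 * inner(R, R)

def grad_F(A, X, Y):
    # A^T (AX - Y), the gradient derived in the question
    return matmul(transpose(A), add(matmul(A, X), Y, -1.0))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
m, n, p = 4, 3, 2  # arbitrary sizes, chosen only for this demo
A = rand_matrix(m, n, rng)
X = rand_matrix(n, p, rng)
H = rand_matrix(n, p, rng)  # direction of the perturbation
Y = rand_matrix(m, p, rng)

# F(X + H) should equal F(X) + <grad F(X), H> + 1/2 <H, A^T A H>
lhs = F(A, add(X, H), Y)
AtAH = matmul(transpose(A), matmul(A, H))
rhs = F(A, X, Y) + inner(grad_F(A, X, Y), H) + 0.5 * inner(H, AtAH)
taylor_gap = abs(lhs - rhs)
print(taylor_gap)  # expected to be at rounding-error level
```

The check uses a finite (not small) perturbation `H` on purpose: for a quadratic function there is no higher-order remainder, so any sizeable gap would point to a wrong gradient or Hessian.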