How to calculate the Hessian of <Ax, x> / <x, x>


I need to calculate the gradient and Hessian of the function $R(x) = \frac{\langle Ax, x \rangle}{\langle x, x \rangle}$, where $A$ is a symmetric real matrix. I am a bit familiar with matrix derivatives, so I hope I calculated the differential correctly: $$ dR = \left \langle \frac{2Ax}{\langle x, x \rangle} - \frac{2x\langle Ax, x \rangle}{\langle x, x \rangle^2}, dx \right \rangle = \langle \nabla R(x), dx \rangle $$ But I am completely lost with the Hessian. I don't understand how to derive it, how to put $d_1$ and $d_2$ in the correct places, and how to keep the sizes of the factors consistent. How can I do that?


There is 1 best solution below


$ \def\c#1{\color{red}{#1}} \def\a{\alpha}\def\b{\beta}\def\l{\lambda} \def\o{{\tt1}}\def\p{\partial} \def\L{\left}\def\R{\right} \def\LR#1{\L(#1\R)} \def\RR#1{\c{\LR{#1}}} \def\BR#1{\Big(#1\Big)} \def\trace#1{\operatorname{Tr}\LR{#1}} \def\qiq{\quad\implies\quad} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\hess#1#2#3{\frac{\p^2 #1}{\p #2\,\p #3}} $Let's use a colon to denote the trace/Frobenius product, i.e. $$\eqalign{ A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{AB^T} \\ A:A &= \big\|A\big\|^2_F \\ }$$ When applied to vectors $(n=\o)$ it reduces to the standard dot product.

Then consider the following scalar functions and their differentials. $$\eqalign{ \a &= A:xx^T &\qiq &d\a=2Ax:dx,\quad&A^T=A \\ \b &= B:xx^T &\qiq &d\b=2Bx:dx,\quad&B^T=B \\ }$$ Use these to rewrite your objective function, then calculate its differential and gradient. $$\eqalign{ \l &= \b^{-1}\a \\ d\l &= \b^{-2}\LR{\b\,d\a-\a\,d\b} \\ &= \b^{-1}\LR{2Ax-2\l Bx}:dx \\ \grad{\l}{x} &= 2\b^{-1}\LR{A-\l B}x \;\;\doteq\;\; g \\ }$$ Now calculate the differential and gradient of the gradient. $$\eqalign{ dg &= 2\b^{-1}\LR{A-\l B}\RR{dx} - 2\b^{-1}Bx\RR{d\l} + 2\LR{A-\l B}x\RR{d\b^{-1}} \\ &= 2\b^{-1}\LR{A-\l B}dx -2\b^{-1}\LR{Bx}g^Tdx - \LR{\b g}\,\b^{-2}\LR{2Bx}^Tdx \\ &= 2\b^{-1}\BR{A-\l B -{Bx}g^T -gx^TB}\,dx \\ \grad{g}{x} &= 2\b^{-1}\BR{A-\l B -{Bx}g^T -gx^TB} \;\;\doteq\;\; H \\ }$$ So that's the Hessian. Obviously you are interested in the case where $B=I$.
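For the case in the question, substituting $B=I$ gives the following. This is only a specialization of the result above, with $\beta$, $\lambda$ and $g$ playing the same roles as there. $$\eqalign{ \beta &= x^Tx, \qquad \lambda = \frac{x^TAx}{x^Tx} \\ g &= \frac{2}{\beta}\left(A-\lambda I\right)x \\ H &= \frac{2}{\beta}\left(A-\lambda I - xg^T - gx^T\right) \\ }$$ As a sanity check, $g$ agrees with the gradient written in the question, and $H$ is symmetric. If $x$ happens to be an eigenvector of $A$, then $g=0$ and the Hessian reduces to $H = \frac{2}{\beta}\left(A-\lambda I\right)$.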

To reduce some of the expressions above, the following relationships were utilized $$\eqalign{ &d\l = g:dx \;=\; g^Tdx \\ &d\b^{-1} = -\b^{-2}d\b \;=\; -\b^{-2}\LR{2Bx}^Tdx \\ &\b g = 2\LR{A-\l B}x \\ }$$
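If you want to convince yourself numerically, a finite-difference comparison is a quick check. The following is a minimal sketch, assuming NumPy is available; the function names (`rayleigh`, `gradient`, `hessian`, `fd_gradient`, `fd_jacobian`), the random test data and the step size `h` are illustrative choices and not part of the derivation.

```python
import numpy as np

def rayleigh(A, B, x):
    """Generalized Rayleigh quotient lambda = (x^T A x) / (x^T B x)."""
    return (x @ A @ x) / (x @ B @ x)

def gradient(A, B, x):
    """g = (2/beta) (A - lambda B) x, with beta = x^T B x."""
    beta = x @ B @ x
    lam = (x @ A @ x) / beta
    return (2.0 / beta) * ((A - lam * B) @ x)

def hessian(A, B, x):
    """H = (2/beta) (A - lambda B - (Bx) g^T - g (Bx)^T), for symmetric A and B."""
    beta = x @ B @ x
    lam = (x @ A @ x) / beta
    g = (2.0 / beta) * ((A - lam * B) @ x)
    Bx = B @ x
    return (2.0 / beta) * (A - lam * B - np.outer(Bx, g) - np.outer(g, Bx))

def fd_gradient(f, x, h=1e-6):
    """Central-difference gradient of a scalar function f at x."""
    out = np.zeros(x.size)
    for i in range(x.size):
        e = np.zeros(x.size)
        e[i] = h
        out[i] = (f(x + e) - f(x - e)) / (2 * h)
    return out

def fd_jacobian(F, x, h=1e-6):
    """Central-difference Jacobian of a vector function F at x (column i = dF/dx_i)."""
    cols = []
    for i in range(x.size):
        e = np.zeros(x.size)
        e[i] = h
        cols.append((F(x + e) - F(x - e)) / (2 * h))
    return np.column_stack(cols)

# Arbitrary test data: a random symmetric A, B = I (the case in the question).
rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
B = np.eye(n)
x = rng.standard_normal(n)

g = gradient(A, B, x)
H = hessian(A, B, x)

# Largest entrywise deviation between the closed forms and finite differences.
grad_err = np.max(np.abs(g - fd_gradient(lambda v: rayleigh(A, B, v), x)))
hess_err = np.max(np.abs(H - fd_jacobian(lambda v: gradient(A, B, v), x)))
print("gradient error:", grad_err)
print("Hessian error: ", hess_err)
```

Both errors should be small relative to the entries of $g$ and $H$. The code writes $x^TB$ as $(Bx)^T$, which relies on $B$ being symmetric, the same assumption the derivation makes.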