What is the Hessian matrix of $f\left(x\right)=\left\langle Ax,x\right\rangle \cdot\left\langle Bx,x\right\rangle $?


I'm trying to find the Hessian matrix of $f\colon\mathbb{R}^{n}\to\mathbb{R}$ defined by $f\left(x\right)=\left\langle Ax,x\right\rangle \cdot\left\langle Bx,x\right\rangle $, where $A,B$ are symmetric $n\times n$ matrices.

What I know is that if we let $g\left(x\right)=\left\langle Ax,x\right\rangle $ and $h\left(x\right)=\left\langle Bx,x\right\rangle $, then $\nabla g\left(x\right)=2Ax,\nabla h\left(x\right)=2Bx$ and $\nabla^{2}g\left(x\right)=2A,\nabla^{2}h\left(x\right)=2B$. Also, for scalar functions the product rule $\left(fg\right)'=f'g+fg'$ gives \begin{align*} \left(fg\right)'' & =f''g+f'g'+f'g'+fg''=\\ & =f''g+2f'g'+fg'' \end{align*}

Regarding $\nabla f\left(x\right)$ as a column vector, I tried to apply this to the given $f\left(x\right)$, and what I got is $$ \nabla f\left(x\right)=\nabla\left(gh\right)\left(x\right)=2Ax\cdot\left\langle Bx,x\right\rangle +\left\langle Ax,x\right\rangle \cdot2Bx $$ which agrees with a concrete example I tested.

But then I got to the Hessian: \begin{align*} \nabla^{2}f\left(x\right) & =\nabla^{2}\left(gh\right)\left(x\right)=2A\cdot\left\langle Bx,x\right\rangle +\underset{{\scriptscriptstyle \left(\ast\right)}}{\underbrace{2Ax\cdot2Bx}}+\underset{{\scriptscriptstyle \left(\ast\right)}}{\underbrace{2Ax\cdot2Bx}}+\left\langle Ax,x\right\rangle \cdot2B=\\ & =2A\cdot\left\langle Bx,x\right\rangle +\underset{{\scriptscriptstyle \left(\ast\right)}}{\underbrace{8Ax\cdot Bx}}+\left\langle Ax,x\right\rangle \cdot2B \end{align*} Since $Ax,Bx$ in $\left(\ast\right)$ are both column vectors, their product is not defined as written, so I thought I should try this instead: $$ \nabla^{2}f\left(x\right)=2A\cdot\left\langle Bx,x\right\rangle +\underset{{\scriptscriptstyle \left(\ast\ast\right)}}{\underbrace{8Ax\cdot\left(Bx\right)^{T}}}+\left\langle Ax,x\right\rangle \cdot2B $$ But that did not match the Hessian in my example.

In general, differentiating functions written in terms of matrices is still something of a mystery to me, especially knowing where to transpose. Any help is appreciated. Thanks in advance.


There are 2 best solutions below

0
On BEST ANSWER

We can write formulas for $f_i$ and $f_{ij}$ (individual first and second partial derivatives) of $f$: $$ f_i(x) = g_i(x)h(x) + g(x)h_i(x) $$ and $$ f_{ij}(x) = g_{ij}(x)h(x) + g_i(x)h_j(x) + g_j(x)h_i(x) + g(x)h_{ij}(x). $$

We can also write the quadratic form $x^{\textrm{T}} A x$ in a form that is easier to differentiate: $$ g(x) = \sum_i \sum_j A_{ij}x_i x_j $$ where $A_{ij}=A_{ji}$ is the entry in row $i$, column $j$ of $A$ and $x_i$ is the $i$th variable. So $$ \begin{align} g_k(x) &= \sum_i \sum_j A_{ij}(\delta_{ik}x_j + x_i\delta_{jk}) \\ &= \sum_j A_{kj} x_j + \sum_i A_{ik}x_i \\ &= \sum_i 2A_{ik}x_i \\ &= 2(A_{k*} \cdot x) \end{align} $$ where $\delta_{ij}$ is the Kronecker delta and $A_{k*}$ is the $k$th row of $A$. The third line uses the symmetry $A_{kj}=A_{jk}$ to combine the two sums. The second partial derivative with respect to variables $k$ and $l$ is $$ g_{kl}(x) = \sum_i 2A_{ik}\delta_{il} = 2A_{lk} = 2A_{kl}. $$
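The identity $g_k(x) = 2(A_{k*}\cdot x)$, that is $\nabla g = 2Ax$, can be spot-checked numerically. The following is only a minimal sketch using NumPy and central finite differences; the helper names `quad_form` and `numerical_gradient`, the random seed, and the step size are illustrative choices, not part of the derivation above.

```python
import numpy as np

def quad_form(A, x):
    """g(x) = <Ax, x> = x^T A x."""
    return x @ A @ x

def numerical_gradient(func, x, eps=1e-6):
    """Central-difference approximation of the gradient of a scalar function."""
    grad = np.zeros_like(x)
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        grad[k] = (func(x + step) - func(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n = 4
M = rng.standard_normal((n, n))  # a generic, non-symmetric matrix
A = (M + M.T) / 2                # its symmetric part
x = rng.standard_normal(n)

grad_numeric = numerical_gradient(lambda v: quad_form(A, v), x)
grad_formula = 2 * A @ x         # the formula derived above (A symmetric)
gradient_matches = np.allclose(grad_numeric, grad_formula, atol=1e-6)
print(gradient_matches)
```

If $A$ were not symmetric, the second line of the derivation would still hold and the gradient would be $(A + A^{\textrm{T}})x$ instead, which is why the symmetry assumption matters in the third line.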

Using these formulas for the partial derivatives of $g$ (and $h$) gives the desired result: $$ f_{ij}(x) = 2A_{ij}h(x) + 4(A_{i*}\cdot x)(B_{j*}\cdot x) + 4(A_{j*}\cdot x)(B_{i*}\cdot x) + 2B_{ij}g(x). $$

I derived the identities $\nabla g = 2Ax$ and $\nabla^2 g = 2A$ in component form and then used this to compute the individual components of the Hessian of $f$. The point is that when working with matrices, it is often easier to break everything down into individual components. For example, in a matrix product $PQ$, you would work with $(PQ)_{ij}$ instead of the matrix product itself.
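Collecting the components back into matrices may make the comparison with the attempt in the question easier. Since $A_{i*}\cdot x$ is the $i$th entry of $Ax$, the product $(A_{i*}\cdot x)(B_{j*}\cdot x)$ is entry $(i,j)$ of the outer product $Ax\,(Bx)^{\textrm{T}}$, and $(A_{j*}\cdot x)(B_{i*}\cdot x)$ is entry $(i,j)$ of $Bx\,(Ax)^{\textrm{T}}$. Reading the component formula this way gives $$ \nabla^2 f(x) = 2h(x)\,A + 4\,Ax\,(Bx)^{\textrm{T}} + 4\,Bx\,(Ax)^{\textrm{T}} + 2g(x)\,B. $$ The two cross terms are transposes of each other, not equal, so they cannot be merged into a single $8Ax\,(Bx)^{\textrm{T}}$; that merged term is generally not even symmetric, while a Hessian of a smooth function must be. As a quick sanity check in the scalar case $n=1$ with $A=a$ and $B=b$, we have $f(x)=abx^4$ and $f''(x)=12abx^2$, and the formula gives $$ 2abx^2 + 4abx^2 + 4abx^2 + 2abx^2 = 12abx^2. $$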

0
On

Your function is the product of the following scalar functions $$\eqalign{ \alpha &= x^TAx \quad\implies d\alpha = (2Ax)^Tdx \\ \beta &= x^TBx \quad\implies d\beta = (2Bx)^Tdx \\ f &= \alpha\beta \\ }$$ Calculate the differential and the gradient of $f$. Here $g$ denotes the gradient vector of $f$, not the function $g(x)$ from the question. $$\eqalign{ df &= \alpha\,d\beta + \beta\,d\alpha \\ &= 2(\alpha Bx + \beta Ax)^Tdx \\ \frac{\partial f}{\partial x} &= 2(\alpha Bx + \beta Ax) \;=\; g \qquad ({\rm the\,gradient\,vector}) \\ }$$ Calculate the differential and the gradient of $g$, using the symmetry of $A$ and $B$ to write $(2Ax)^T = 2x^TA$ and $(2Bx)^T = 2x^TB$. $$\eqalign{ dg &= 2(\alpha B\,dx + Bx\,d\alpha + \beta A\,dx + Ax\,d\beta) \\ &= 2\left(\alpha B + Bx(2Ax)^T + \beta A + Ax(2Bx)^T\right)dx \\ &= 2\left(\alpha B + 2Bxx^TA + \beta A + 2Axx^TB\right)dx \\ \frac{\partial g}{\partial x} &= 2\alpha B + 4Bxx^TA + 2\beta A + 4Axx^TB \;=\; H \quad({\rm the\,Hessian\,matrix})\\ }$$
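The formula for $H$ can be compared against a finite-difference Hessian, alongside the $(\ast\ast)$ candidate from the question. This is a rough numerical sketch, assuming NumPy; the function names, the seed, the step size, and the tolerance are illustrative assumptions rather than anything prescribed by the derivation.

```python
import numpy as np

def f(A, B, x):
    """f(x) = <Ax, x> * <Bx, x>."""
    return (x @ A @ x) * (x @ B @ x)

def hessian_formula(A, B, x):
    """H = 2*alpha*B + 4*Bx x^T A + 2*beta*A + 4*Ax x^T B, for symmetric A, B."""
    alpha, beta = x @ A @ x, x @ B @ x
    Ax, Bx = A @ x, B @ x
    # For symmetric A and B: Bx x^T A = outer(Bx, Ax), Ax x^T B = outer(Ax, Bx).
    return 2 * alpha * B + 4 * np.outer(Bx, Ax) + 2 * beta * A + 4 * np.outer(Ax, Bx)

def hessian_attempt(A, B, x):
    """The (**) candidate from the question, with a single 8*Ax*(Bx)^T cross term."""
    alpha, beta = x @ A @ x, x @ B @ x
    return 2 * beta * A + 8 * np.outer(A @ x, B @ x) + 2 * alpha * B

def numerical_hessian(func, x, eps=1e-4):
    """Central-difference approximation of the Hessian of a scalar function."""
    n = x.size
    H = np.zeros((n, n))
    I = np.eye(n)
    for i in range(n):
        for j in range(n):
            ei, ej = eps * I[i], eps * I[j]
            H[i, j] = (func(x + ei + ej) - func(x + ei - ej)
                       - func(x - ei + ej) + func(x - ei - ej)) / (4 * eps**2)
    return H

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n = 3
P, Q = rng.standard_normal((n, n)), rng.standard_normal((n, n))
A, B = (P + P.T) / 2, (Q + Q.T) / 2   # random symmetric matrices
x = rng.standard_normal(n)

H_numeric = numerical_hessian(lambda v: f(A, B, v), x)
H_formula = hessian_formula(A, B, x)
H_attempt = hessian_attempt(A, B, x)

print(np.allclose(H_formula, H_numeric, atol=1e-4))   # derived formula vs. finite differences
print(np.allclose(H_attempt, H_numeric, atol=1e-4))   # question's (**) candidate vs. finite differences
```

The candidate from the question differs from $H$ by $4Ax(Bx)^T - 4Bx(Ax)^T$, which vanishes only when $Ax$ and $Bx$ are parallel, so a generic test vector is enough to tell the two apart.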