Hessian matrix of multiplication


Introduction

Let's assume $A(\vec{x})$, $B(\vec{x})$ and $C(\vec{x})$ are scalar-valued functions of n inputs (so $\vec{x}$ is an $n\times 1$ vector). The expressions for A, B and C are known but arbitrary. I can easily calculate $H(A)$, $H(B)$ and $H(C)$ separately, where H denotes the Hessian matrix.

Now if we define $f(\vec{x}) = A(\vec{x}) . B(\vec{x}) . C(\vec{x})$ (where the dot is ordinary multiplication), is there an elegant way to find $H(f)$ from $H(A)$, $H(B)$ and $H(C)$? I'm asking because finding the Hessians of A, B and C separately is very simple, but the Hessian of A.B.C is hard to calculate analytically.

Example

A very simple two-variable example:

$\vec{x} = \{x_1, x_2\}$

$A(\vec{x}) = 3*x_1^2+5*x_2^2+2*x_1+x_2+1$

$B(\vec{x}) = cos(x_1)+cos(x_2)$

$C(\vec{x}) = ln(x_1)+ln(x_2)$

With the help of symbolic Octave, I found the Hessian matrices:

$H(A)=\left[\begin{matrix}6 & 0\\0 & 10\end{matrix}\right]$

$H(B)=\left[\begin{matrix}- \cos{\left(x_{1} \right)} & 0\\0 & - \cos{\left(x_{2} \right)}\end{matrix}\right]$

$H(C)=\left[\begin{matrix}- \frac{1}{x_{1}^{2}} & 0\\0 & - \frac{1}{x_{2}^{2}}\end{matrix}\right]$

$H(A.B.C) =\left[\begin{matrix}- 2 \cdot \left(6 x_{1} + 2\right) \left(\log{\left(x_{1} \right)} + \log{\left(x_{2} \right)}\right) \sin{\left(x_{1} \right)} + 6 \left(\log{\left(x_{1} \right)} + \log{\left(x_{2} \right)}\right) \left(\cos{\left(x_{1} \right)} + \cos{\left(x_{2} \right)}\right) - \left(\log{\left(x_{1} \right)} + \log{\left(x_{2} \right)}\right) \left(3 x_{1}^{2} + 2 x_{1} + 5 x_{2}^{2} + x_{2} + 1\right) \cos{\left(x_{1} \right)} + \frac{2 \cdot \left(6 x_{1} +2\right) \left(\cos{\left(x_{1} \right)} + \cos{\left(x_{2} \right)}\right)}{x_{1}} - \frac{2 \cdot \left(3 x_{1}^{2} + 2 x_{1} + 5 x_{2}^{2} + x_{2} + 1\right) \sin{\left(x_{1} \right)}}{x_{1}} - \frac{\left(\cos{\left(x_{1} \right)} +\cos{\left(x_{2} \right)}\right) \left(3 x_{1}^{2} + 2 x_{1} + 5 x_{2}^{2} + x_{2} + 1\right)}{x_{1}^{2}} & - \left(6 x_{1} + 2\right) \left(\log{\left(x_{1} \right)} + \log{\left(x_{2} \right)}\right) \sin{\left(x_{2} \right)} - \left(10 x_{2} + 1\right) \left(\log{\left(x_{1} \right)} + \log{\left(x_{2} \right)}\right) \sin{\left(x_{1} \right)} + \frac{\left(6 x_{1} + 2\right) \left(\cos{\left(x_{1} \right)} + \cos{\left(x_{2} \right)}\right)}{x_{2}} - \frac{\left(3 x_{1}^{2} + 2 x_{1} + 5 x_{2}^{2} + x_{2} + 1\right) \sin{\left(x_{1} \right)}}{x_{2}} + \frac{\left(10 x_{2} + 1\right) \left(\cos{\left(x_{1} \right)} + \cos{\left(x_{2} \right)}\right)}{x_{1}} - \frac{\left(3 x_{1}^{2} + 2 x_{1} + 5 x_{2}^{2} + x_{2} + 1\right) \sin{\left(x_{2} \right)}}{x_{1}}\\- \left(6 x_{1} + 2\right) \left(\log{\left(x_{1} \right)} + \log{\left(x_{2} \right)}\right) \sin{\left(x_{2} \right)} - \left(10 x_{2} + 1\right) \left(\log{\left(x_{1} \right)} + \log{\left(x_{2} \right)}\right) \sin{\left(x_{1} \right)} + \frac{\left(6 x_{1} + 2\right) \left(\cos{\left(x_{1} \right)} + \cos{\left(x_{2} \right)}\right)}{x_{2}} - \frac{\left(3 x_{1}^{2} + 2 x_{1} + 5 x_{2}^{2} + x_{2} + 1\right) \sin{\left(x_{1} \right)}}{x_{2}} + \frac{\left(10 x_{2} + 1\right) 
\left(\cos{\left(x_{1} \right)} + \cos{\left(x_{2} \right)}\right)}{x_{1}} - \frac{\left(3 x_{1}^{2} + 2 x_{1} + 5 x_{2}^{2} + x_{2} + 1\right) \sin{\left(x_{2} \right)}}{x_{1}} & - 2 \cdot \left(10 x_{2} + 1\right) \left(\log{\left(x_{1} \right)} + \log{\left(x_{2} \right)}\right) \sin{\left(x_{2} \right)} + 10 \left(\log{\left(x_{1} \right)} + \log{\left(x_{2} \right)}\right) \left(\cos{\left(x_{1} \right)} + \cos{\left(x_{2} \right)}\right) - \left(\log{\left(x_{1} \right)} + \log{\left(x_{2} \right)}\right) \left(3 x_{1}^{2} + 2 x_{1} + 5 x_{2}^{2} + x_{2} + 1\right) \cos{\left(x_{2} \right)} + \frac{2 \cdot \left(10 x_{2} +1\right) \left(\cos{\left(x_{1} \right)} + \cos{\left(x_{2} \right)}\right)}{x_{2}} - \frac{2 \cdot \left(3 x_{1}^{2} + 2 x_{1} + 5 x_{2}^{2} + x_{2} + 1\right) \sin{\left(x_{2} \right)}}{x_{2}} - \frac{\left(\cos{\left(x_{1} \right)} +\cos{\left(x_{2} \right)}\right) \left(3 x_{1}^{2} + 2 x_{1} + 5 x_{2}^{2} + x_{2} + 1\right)}{x_{2}^{2}}\end{matrix}\right]$

As you can see, H(A), H(B) and H(C) are fairly simple to calculate and have very few terms each, but H(A.B.C) has many more terms, and computing it analytically becomes impractical for larger problems.

My Question

Is there an elegant way to find $H(A.B.C)$ (the Hessian of the product of A, B and C) from A, B, C, $H(A)$, $H(B)$, $H(C)$ and $\nabla(A)$, $\nabla(B)$, $\nabla(C)$?

Footnote

As you know, this is the definition of the Hessian matrix:

$H(f) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1\,\partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1\,\partial x_n} \\[2.2ex] \dfrac{\partial^2 f}{\partial x_2\,\partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2\,\partial x_n} \\[2.2ex] \vdots & \vdots & \ddots & \vdots \\[2.2ex] \dfrac{\partial^2 f}{\partial x_n\,\partial x_1} & \dfrac{\partial^2 f}{\partial x_n\,\partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}$

Best answer

Let's start with something simpler, but general enough. We want to compute $H[A\times B]$. Treat the gradient $\nabla$ as a row vector, so that $(\nabla [B])^T (\nabla [A])$ is an $n\times n$ outer product. The Hessian is the Jacobian of $\nabla [A\times B] = \nabla [A]\times B + A\times \nabla [B]$, by the product rule. Applying the product rule again, we get \begin{align*} H[A\times B] &= \nabla^T [\nabla [A]\times B] + \nabla^T [A\times \nabla [B]]\\ &= H[A]\times B+(\nabla [B])^T (\nabla [A]) + (\nabla [A])^T (\nabla [B]) + H[B]\times A \end{align*}
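A minimal numerical check of the two-factor rule, as a sketch in plain Python. It uses $A$ and $B$ from the question (assuming $B=\cos(x_1)+\cos(x_2)$, as the posted $H(B)$implies). The helper names `hessian_of_product` and `numeric_hessian`, the test point, and the step size are illustrative choices, not part of the original answer.

```python
import math

# A and B from the question; B is assumed to be cos(x1) + cos(x2).
def A(x1, x2):
    return 3*x1**2 + 5*x2**2 + 2*x1 + x2 + 1

def B(x1, x2):
    return math.cos(x1) + math.cos(x2)

# Hand-derived gradients (plain lists) and Hessians (nested lists).
def grad_A(x1, x2):
    return [6*x1 + 2, 10*x2 + 1]

def grad_B(x1, x2):
    return [-math.sin(x1), -math.sin(x2)]

def hess_A(x1, x2):
    return [[6.0, 0.0], [0.0, 10.0]]

def hess_B(x1, x2):
    return [[-math.cos(x1), 0.0], [0.0, -math.cos(x2)]]

def hessian_of_product(a, b, ga, gb, Ha, Hb):
    """H[A*B] = H[A]*B + gB^T gA + gA^T gB + H[B]*A, entry by entry."""
    n = len(ga)
    return [[Ha[r][c]*b + gb[r]*ga[c] + ga[r]*gb[c] + Hb[r][c]*a
             for c in range(n)] for r in range(n)]

def numeric_hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x."""
    n = len(x)
    H = [[0.0]*n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            def shifted(dr, dc):
                # Evaluate f with coordinates r and c moved by dr*h and dc*h.
                y = list(x)
                y[r] += dr*h
                y[c] += dc*h
                return f(*y)
            H[r][c] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4*h*h)
    return H

point = (1.5, 2.0)  # arbitrary test point
H_formula = hessian_of_product(A(*point), B(*point),
                               grad_A(*point), grad_B(*point),
                               hess_A(*point), hess_B(*point))
H_numeric = numeric_hessian(lambda x1, x2: A(x1, x2)*B(x1, x2), point)
max_err = max(abs(H_formula[r][c] - H_numeric[r][c])
              for r in range(2) for c in range(2))
print("largest discrepancy:", max_err)
```

The discrepancy should be at the level of finite-difference error, so a small value here supports the formula without proving it.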

Using this and $\nabla [A\times B] = \nabla [A]\times B + A\times \nabla [B]$, we get \begin{align*} H[A B C] =& H[AB]\times C+(\nabla [C])^T (\nabla [AB]) + (\nabla [AB])^T (\nabla [C]) + H[C]\times AB\\ =&H[A]\times BC+(\nabla [B])^T (\nabla [A])\times C + (\nabla [A])^T (\nabla [B])\times C + H[B]\times AC\\&+(\nabla [C])^T (\nabla [A])\times B + (\nabla [C])^T (\nabla [B])\times A+(\nabla [A])^T (\nabla [C])\times B + (\nabla [B])^T (\nabla [C])\times A\\ &+H[C]\times AB \end{align*}

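This looks pretty bad, but the pattern is actually simple: \begin{align*} H\left[\prod_{i} f_i\right] &= \sum_{i} H[f_i] \prod_{j\neq i} f_j + \sum_{i}\sum_{j\neq i} \nabla [f_i]^T \nabla[f_j] \prod_{k\notin \{ i,j\}} f_k \end{align*} The double sum runs over ordered pairs $(i,j)$, so each outer product appears together with its transpose and the result is symmetric.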
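As a sanity check, the pattern can be applied by hand to the off-diagonal entry of the question's example, taking $B = \cos(x_1)+\cos(x_2)$ as the posted $H(B)$ implies and writing $\partial_r$ for $\partial/\partial x_r$. All three Hessians are diagonal, so the first sum contributes nothing to entry $(1,2)$ and only the six gradient products remain:

\begin{align*} H[ABC]_{12} =&\; (\partial_1 A\,\partial_2 B + \partial_1 B\,\partial_2 A)\,C + (\partial_1 A\,\partial_2 C + \partial_1 C\,\partial_2 A)\,B + (\partial_1 B\,\partial_2 C + \partial_1 C\,\partial_2 B)\,A\\ =&\; -\left[(6x_1+2)\sin x_2 + (10x_2+1)\sin x_1\right](\ln x_1+\ln x_2)\\ &+\left[\frac{6x_1+2}{x_2}+\frac{10x_2+1}{x_1}\right](\cos x_1+\cos x_2)\\ &-\left[\frac{\sin x_1}{x_2}+\frac{\sin x_2}{x_1}\right]\left(3x_1^2+5x_2^2+2x_1+x_2+1\right) \end{align*}

These are the same six terms as in the off-diagonal entry of the symbolic $H(A.B.C)$ in the question.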

You can prove this by induction using the above formulas for $\nabla[A\times B]$ and $H[A\times B]$ using $A=\prod_{i=1}^{n-1} f_i$ and $B=f_n$.
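The general pattern can also be evaluated numerically at a point, given only the values, gradients and Hessians of the factors. The following is a sketch in plain Python under stated assumptions: `product_hessian`, `numeric_hessian` and `factors` are hypothetical helper names, $B$ is taken as $\cos(x_1)+\cos(x_2)$, and the finite-difference comparison is only a plausibility check.

```python
import math

def product_hessian(values, grads, hessians):
    """Hessian of prod_i f_i at one point.

    values[i]   : value of f_i (float)
    grads[i]    : gradient of f_i (list of n floats)
    hessians[i] : Hessian of f_i (n x n nested list)
    """
    m, n = len(values), len(grads[0])

    def prod_except(skip):
        # Product of all factor values whose index is not in `skip`.
        # Multiplying directly (rather than dividing the full product)
        # stays valid when some factor is zero.
        p = 1.0
        for k in range(m):
            if k not in skip:
                p *= values[k]
        return p

    H = [[0.0]*n for _ in range(n)]
    for i in range(m):
        w = prod_except({i})
        for r in range(n):
            for c in range(n):
                H[r][c] += hessians[i][r][c] * w        # first sum
        for j in range(m):
            if j == i:
                continue
            w2 = prod_except({i, j})
            for r in range(n):
                for c in range(n):
                    H[r][c] += grads[i][r] * grads[j][c] * w2  # second sum
    return H

def numeric_hessian(f, x, h=1e-4):
    """Central-difference approximation of the Hessian of f at x."""
    n = len(x)
    H = [[0.0]*n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            def shifted(dr, dc):
                y = list(x)
                y[r] += dr*h
                y[c] += dc*h
                return f(*y)
            H[r][c] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4*h*h)
    return H

def factors(x1, x2):
    # A, B, C from the question with hand-derived gradients and Hessians.
    values = [3*x1**2 + 5*x2**2 + 2*x1 + x2 + 1,
              math.cos(x1) + math.cos(x2),
              math.log(x1) + math.log(x2)]
    grads = [[6*x1 + 2, 10*x2 + 1],
             [-math.sin(x1), -math.sin(x2)],
             [1/x1, 1/x2]]
    hessians = [[[6.0, 0.0], [0.0, 10.0]],
                [[-math.cos(x1), 0.0], [0.0, -math.cos(x2)]],
                [[-1/x1**2, 0.0], [0.0, -1/x2**2]]]
    return values, grads, hessians

def f(x1, x2):
    a, b, c = factors(x1, x2)[0]
    return a * b * c

point = (1.5, 2.0)  # arbitrary test point with x1, x2 > 0 (needed for ln)
H_formula = product_hessian(*factors(*point))
H_numeric = numeric_hessian(f, point)
max_err = max(abs(H_formula[r][c] - H_numeric[r][c])
              for r in range(2) for c in range(2))
print("largest discrepancy:", max_err)
```

With $m$ factors, the double sum visits every ordered pair, so this direct implementation does on the order of $m^2$ outer products. That is fine for a handful of factors; for many factors one would probably want to reuse partial products.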