Hessian Matrix of an Angle in Terms of the Vertices

Asked by Bumbble Comm at 30 Mar 2026 - 8:24

I am attempting to derive an analytical formula for the Hessian matrix (the matrix of second derivatives) of an angle in terms of the 9 coordinates of the three 3D points that define it. I have derived a formula for this (given below), but when I attempt to validate it numerically I get incorrect answers, e.g., non-symmetric matrices. I suspect that I have misunderstood or misapplied some principle of vector calculus in coming up with these formulae, and I would appreciate it if anyone could (a) find the error, (b) provide an alternate formula, or (c) point me toward a derivation elsewhere.

Consider the following three points:

$a = (x_a, y_a, z_a)\\ b = (x_b, y_b, z_b)\\ c = (x_c, y_c, z_c)$

Let $\theta = \arccos\left( \frac{b - a}{||b - a||} \cdot \frac{c - a}{||c - a||} \right)$ be the angle between vectors $\mathbf{v}_{ab} = b-a$ and $\mathbf{v}_{ac} = c-a$. Additionally, let $\mathbf{u}_{ab} = \frac{b - a}{||b - a||}$ and $\mathbf{u}_{ac} = \frac{c - a}{||c - a||}$.

I am performing a minimization for which I need a closed form of both the gradient vector and the Hessian matrix of $\theta$ in terms of the coordinates of the points $a$, $b$, and $c$. The details of the problem are not important, aside from the fact that numerical estimation of the derivatives is not an option due to the size of the problem. Formally, the formulas I need are:

$\nabla \theta = \begin{pmatrix} \partial\theta/\partial x_a\\ \partial\theta/\partial y_a\\ \partial\theta/\partial z_a\\ \vdots\\ \partial\theta/\partial z_c \end{pmatrix}\;$ and $\;\mathbf{H}(\theta) = \begin{pmatrix} \partial^2 \theta / \partial x_a^2 & \partial^2 \theta / \partial x_a \partial y_a & \cdots & \partial^2 \theta / \partial x_a \partial z_c \\ \partial^2 \theta / \partial y_a \partial x_a & \partial^2 \theta / \partial y_a^2 & \cdots & \partial^2 \theta / \partial y_a \partial z_c \\ \vdots & \vdots & \ddots & \vdots \\ \partial^2 \theta / \partial z_c \partial x_a & \partial^2 \theta / \partial z_c \partial y_a & \cdots & \partial^2 \theta / \partial z_c^2 \end{pmatrix}$

I have already found the gradient analytically and tested it numerically; I define it as follows:

$\nabla \theta = \begin{pmatrix} \theta_a \\ \theta_b \\ \theta_c \end{pmatrix}$ where $\theta_a$, $\theta_b$, and $\theta_c$ are column vectors representing $\nabla \theta$ in terms of points $a$, $b$, and $c$:

$\theta_a = \begin{pmatrix} \partial \theta / \partial x_a \\ \partial \theta / \partial y_a \\ \partial \theta / \partial z_a \\\end{pmatrix} = -(\theta_b + \theta_c) \\ \theta_b = \begin{pmatrix} \partial \theta / \partial x_b \\ \partial \theta / \partial y_b \\ \partial \theta / \partial z_b \\\end{pmatrix} = \frac{1}{||c-b||}(\mathbf{u}_{ab} \cot\theta - \mathbf{u}_{ac} \csc\theta)\\ \theta_c = \begin{pmatrix} \partial \theta / \partial x_c \\ \partial \theta / \partial y_c \\ \partial \theta / \partial z_c \\\end{pmatrix} = \frac{1}{||c-b||}(\mathbf{u}_{ac} \cot\theta - \mathbf{u}_{ab} \csc\theta) $
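A gradient of this form can be checked against central differences with a few lines of code. The following is a minimal sketch, not the code used for the original validation; the helper names and the test points are made up for illustration. Note that it uses the prefactors $1/||b-a||$ for $\theta_b$ and $1/||c-a||$ for $\theta_c$, which is what differentiating $\arccos(\mathbf{u}_{ab} \cdot \mathbf{u}_{ac})$ directly gives. This differs from the $1/||c-b||$ written above, so treat it as an assumption of the sketch.

```python
import math

def sub(p, q):
    return [pi - qi for pi, qi in zip(p, q)]

def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def norm(p):
    return math.sqrt(dot(p, p))

def angle(a, b, c):
    """Angle at vertex a between b - a and c - a."""
    vab, vac = sub(b, a), sub(c, a)
    return math.acos(dot(vab, vac) / (norm(vab) * norm(vac)))

def analytic_gradient(a, b, c):
    """Blocks (theta_a, theta_b, theta_c) flattened to 9 numbers.

    Assumption of this sketch: prefactors 1/||b-a|| and 1/||c-a||.
    """
    vab, vac = sub(b, a), sub(c, a)
    rab, rac = norm(vab), norm(vac)
    uab = [x / rab for x in vab]
    uac = [x / rac for x in vac]
    t = angle(a, b, c)
    cot, csc = math.cos(t) / math.sin(t), 1.0 / math.sin(t)
    tb = [(cot * p - csc * q) / rab for p, q in zip(uab, uac)]
    tc = [(cot * q - csc * p) / rac for p, q in zip(uab, uac)]
    ta = [-(p + q) for p, q in zip(tb, tc)]
    return ta + tb + tc

def numeric_gradient(a, b, c, h=1e-6):
    """Central-difference gradient with respect to all 9 coordinates."""
    x = list(a) + list(b) + list(c)
    grad = []
    for k in range(9):
        xp, xm = x[:], x[:]
        xp[k] += h
        xm[k] -= h
        fp = angle(xp[0:3], xp[3:6], xp[6:9])
        fm = angle(xm[0:3], xm[3:6], xm[6:9])
        grad.append((fp - fm) / (2 * h))
    return grad

# Arbitrary, non-degenerate test points (illustrative only).
a, b, c = [0.1, -0.2, 0.3], [1.0, 0.5, -0.4], [-0.3, 1.2, 0.8]
max_err = max(abs(p - q) for p, q in
              zip(analytic_gradient(a, b, c), numeric_gradient(a, b, c)))
print(max_err)
```

The printed value is the largest absolute difference between the analytic and the finite-difference gradient entries; it should be small compared with the entries themselves.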

I would like to derive the Hessian using this gradient as a starting place; i.e., if I could find $\mathbf{J}(\theta_b)$ and $\mathbf{J}(\theta_c)$ (where $\mathbf{J}(f)$ is the Jacobian matrix of $f$) in terms of the coordinates $x_a, y_a ... z_c$, expressing the Hessian would be a trivial matter of packing the Jacobians into a larger matrix:

$\mathbf{H}(\theta) = \begin{pmatrix} \mathbf{J}(\theta_a) & \mathbf{J}(\theta_b) & \mathbf{J}(\theta_c) \end{pmatrix}$

To further simplify, I can split the Jacobians further into the following $3\times 3$ sub-matrices and express the Hessian using them:

$\mathbf{H}(\theta) = \begin{pmatrix} \theta_{aa} & \theta_{ba} & \theta_{ca} \\ \theta_{ab} & \theta_{bb} & \theta_{cb} \\ \theta_{ac} & \theta_{bc} & \theta_{cc} \end{pmatrix}$ where

$\theta_{ij} = \begin{pmatrix} \partial \theta_i / \partial x_j & \partial \theta_i / \partial y_j & \partial \theta_i / \partial z_j \end{pmatrix} = \begin{pmatrix} \partial^2 \theta / \partial x_i \partial x_j & \partial^2 \theta / \partial x_i \partial y_j & \partial^2 \theta / \partial x_i \partial z_j \\ \partial^2 \theta / \partial y_i \partial x_j & \partial^2 \theta / \partial y_i \partial y_j & \partial^2 \theta / \partial y_i \partial z_j \\ \partial^2 \theta / \partial z_i \partial x_j & \partial^2 \theta / \partial z_i \partial y_j & \partial^2 \theta / \partial z_i \partial z_j \\ \end{pmatrix}$.

Note that $\theta_{ij} = \theta_{ji}^\intercal$.

I now go about deriving the functions $\theta_{ij}$, starting with $\theta_{ba}$, $\theta_{bb}$, $\theta_{bc}$, $\theta_{ca}$, and $\theta_{cc}$. Using these, the remainder will be trivial to derive. I formulate these matrices by considering the dot product of a column vector, $\theta_i$ with $\nabla_j^\intercal = \begin{pmatrix} \partial/\partial x_j & \partial/\partial y_j & \partial/\partial z_j \end{pmatrix}$: $\theta_{ij} = \theta_i \cdot \nabla_j^\intercal$. To simplify the derivation slightly, note that the following can be easily defined:

$\mathbf{J}_i(\mathbf{u}_{ij}) = \begin{pmatrix} \partial x_{\mathbf{u}_{ij}} / \partial x_i & \partial x_{\mathbf{u}_{ij}} / \partial y_i & \partial x_{\mathbf{u}_{ij}} / \partial z_i \\ \partial y_{\mathbf{u}_{ij}} / \partial x_i & \partial y_{\mathbf{u}_{ij}} / \partial y_i & \partial y_{\mathbf{u}_{ij}} / \partial z_i \\ \partial z_{\mathbf{u}_{ij}} / \partial x_i & \partial z_{\mathbf{u}_{ij}} / \partial y_i & \partial z_{\mathbf{u}_{ij}} / \partial z_i \\ \end{pmatrix} = \mathbf{v}_{ij} \cdot \mathbf{v}_{ij}^\intercal - ||\mathbf{v}_{ij}||^2 \mathbf{I}$.
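For comparison, here is a short derivation of the same Jacobian by the product rule. It assumes $\mathbf{v}_{ij} = j - i$ (consistent with $\mathbf{v}_{ab} = b - a$ above), so that $\partial \mathbf{v}_{ij} / \partial i = -\mathbf{I}$, and it uses $\nabla_i (1/||\mathbf{v}_{ij}||) = \mathbf{v}_{ij}/||\mathbf{v}_{ij}||^3$:

$\begin{align} \mathbf{J}_i(\mathbf{u}_{ij}) &= \frac{1}{||\mathbf{v}_{ij}||} \frac{\partial \mathbf{v}_{ij}}{\partial i} + \mathbf{v}_{ij} \left( \nabla_i \frac{1}{||\mathbf{v}_{ij}||} \right)^\intercal \\ &= -\frac{\mathbf{I}}{||\mathbf{v}_{ij}||} + \frac{\mathbf{v}_{ij} \mathbf{v}_{ij}^\intercal}{||\mathbf{v}_{ij}||^3} \\ &= \frac{\mathbf{v}_{ij} \mathbf{v}_{ij}^\intercal - ||\mathbf{v}_{ij}||^2 \mathbf{I}}{||\mathbf{v}_{ij}||^3} \end{align}$

Under these assumptions the result has the same numerator as the expression above but carries an additional factor of $1/||\mathbf{v}_{ij}||^3$, which may be worth checking in any implementation.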

Additionally, if we let $\gamma(i,j) = 1/||\mathbf{v}_{ij}||$ (this cleans up the notation slightly), then we have:

$\nabla_i \gamma(i,j) = \begin{pmatrix} \partial \gamma(i,j) / \partial x_i \\ \partial \gamma(i,j) / \partial y_i \\ \partial \gamma(i,j) / \partial z_i \end{pmatrix} = \mathbf{v}_{ij} / ||\mathbf{v}_{ij}||^3$.

We can now begin the derivation; using the notation defined above, we can start with $\theta_{ba}$:

$\begin{align} \theta_{ba} &= \theta_b \cdot \nabla_a^\intercal \\ &= (\gamma(b,c) \mathbf{u}_{ab}\cot\theta - \gamma(b,c) \mathbf{u}_{ac}\csc\theta) \cdot \nabla_a^\intercal \\ &= (\gamma(b,c) \mathbf{u}_{ab}\cot\theta)\cdot \nabla_a^\intercal - (\gamma(b,c) \mathbf{u}_{ac}\csc\theta) \cdot \nabla_a^\intercal \\ &= \gamma(b,c)\left( (\cot\theta\, \mathbf{J}_a(\mathbf{u}_{ab}) - \csc^2\theta\, (\mathbf{u}_{ab} \cdot \theta_a^\intercal)) - (\csc\theta\, \mathbf{J}_a(\mathbf{u}_{ac}) - \csc\theta \cot\theta\, (\mathbf{u}_{ac} \cdot \theta_a^\intercal)) \right) \end{align}$

The remainder are given below, but note that even looking just at $\theta_{ba}$, the formula given here does not produce answers that match numerical derivations of the Hessian.

$\begin{align} \theta_{bb} &= \theta_b \cdot \nabla_b^\intercal = (\gamma(b,c) \mathbf{u}_{ab}\cot\theta - \gamma(b,c) \mathbf{u}_{ac}\csc\theta) \cdot \nabla_b^\intercal \\ &= \cot\theta\,(\mathbf{u}_{ab} \cdot \nabla_b\gamma(b,c)^\intercal) + \gamma(b,c)\cot\theta\,\mathbf{J}_b(\mathbf{u}_{ab}) - \gamma(b,c)\csc^2\theta\,(\mathbf{u}_{ab} \cdot \theta_b^\intercal) - \csc\theta\,(\mathbf{u}_{ac} \cdot \nabla_b\gamma(b,c)^\intercal) + \gamma(b,c)\csc\theta\cot\theta\,(\mathbf{u}_{ac}\cdot\theta_b^\intercal) \end{align}$

$\begin{align} \theta_{bc} &= \theta_b \cdot \nabla_c^\intercal = (\gamma(b,c) \mathbf{u}_{ab}\cot\theta - \gamma(b,c) \mathbf{u}_{ac}\csc\theta) \cdot \nabla_c^\intercal \\ &= \cot\theta\,(\mathbf{u}_{ab} \cdot \nabla_c\gamma(b,c)^\intercal) - \gamma(b,c)\csc^2\theta\,(\mathbf{u}_{ab} \cdot \theta_c^\intercal) - \csc\theta\,(\mathbf{u}_{ac} \cdot \nabla_c\gamma(b,c)^\intercal) - \gamma(b,c)\csc\theta\,\mathbf{J}_c(\mathbf{u}_{ac}) + \gamma(b,c)\csc\theta\cot\theta\,(\mathbf{u}_{ac}\cdot\theta_c^\intercal) \end{align}$

$\begin{align} \theta_{ca} &= \theta_c \cdot \nabla_a^\intercal = (\gamma(b,c)\cot\theta\,\mathbf{u}_{ac} - \gamma(b,c)\csc\theta\,\mathbf{u}_{ab}) \cdot \nabla_a^\intercal \\ &= \gamma(b,c)\left( \cot\theta\,\mathbf{J}_a(\mathbf{u}_{ac}) - \csc^2\theta\,(\mathbf{u}_{ac} \cdot \theta_a^\intercal) - \csc\theta\,\mathbf{J}_a(\mathbf{u}_{ab}) + \csc\theta\cot\theta\,(\mathbf{u}_{ab} \cdot \theta_a^\intercal) \right) \end{align}$

$\begin{align} \theta_{cc} &= \theta_c \cdot \nabla_c^\intercal = (\gamma(b,c)\cot\theta\,\mathbf{u}_{ac} - \gamma(b,c)\csc\theta\,\mathbf{u}_{ab}) \cdot \nabla_c^\intercal \\ &= \cot\theta\,(\mathbf{u}_{ac} \cdot \nabla_c\gamma(b,c)^\intercal) + \gamma(b,c)\cot\theta\,\mathbf{J}_c(\mathbf{u}_{ac}) - \gamma(b,c)\csc^2\theta\, (\mathbf{u}_{ac} \cdot \theta_c^\intercal) - \csc\theta\,(\mathbf{u}_{ab} \cdot \nabla_c\gamma(b,c)^\intercal) + \gamma(b,c)\cot\theta\csc\theta\,(\mathbf{u}_{ab} \cdot \theta_c^\intercal) \end{align}$
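Any closed form for these blocks can be compared against a brute-force reference. The sketch below builds the full $9 \times 9$ Hessian of $\theta$ by central differences and checks two properties the true Hessian must have: symmetry, and invariance under translating all three points together (so the three $3 \times 3$ blocks in each block row sum to zero). The function names, the step size `h`, and the test point are illustrative assumptions only.

```python
import math

def angle(x):
    """Angle at a between b - a and c - a, with x = (a, b, c) flattened to 9 numbers."""
    vab = [x[3 + k] - x[k] for k in range(3)]
    vac = [x[6 + k] - x[k] for k in range(3)]
    d = sum(p * q for p, q in zip(vab, vac))
    n = math.sqrt(sum(p * p for p in vab) * sum(q * q for q in vac))
    return math.acos(d / n)

def numeric_hessian(f, x, h=1e-4):
    """Central-difference Hessian of a scalar function f at the point x."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Shift coordinate i by si*h and coordinate j by sj*h.
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

# Arbitrary, non-degenerate test point (illustrative only).
x0 = [0.1, -0.2, 0.3, 1.0, 0.5, -0.4, -0.3, 1.2, 0.8]
H = numeric_hessian(angle, x0)

# Largest violation of symmetry H[i][j] == H[j][i].
asym = max(abs(H[i][j] - H[j][i]) for i in range(9) for j in range(9))

# Translation invariance: for each row, the a, b and c entries of the same
# coordinate axis should cancel.
trans = max(abs(H[i][k] + H[i][3 + k] + H[i][6 + k])
            for i in range(9) for k in range(3))
print(asym, trans)
```

Both printed numbers should be close to zero up to finite-difference error. An analytic block such as $\theta_{ba}$ can then be compared entry by entry with rows 3 to 5 and columns 0 to 2 of `H`.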

**Bumbble Comm** · Answer 1 · 2015-08-31 12:11:54

Many years ago I had similar tasks in computational geometry. Formulas like this are complicated enough that it is very easy to make a mistake. So I decided to develop a C++ template class that holds some quantity (a number) together with its first two derivatives with respect to some real parameters. (The number of real parameters is a template parameter.) So the class contains a value, a gradient vector and a Hessian matrix. Then I implemented the usual arithmetic operations and functions (+, *, /, exp, log, atan etc.) for that class. With this code, we just need to write out the formula (e.g. compute the error of the fitted surface) and we get the derivatives with respect to the real parameters, which is everything we need for Newton-Raphson.
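The idea can be sketched in a few dozen lines. The following is a minimal illustration in Python rather than C++, and it is not the class described above: the name `Jet`, its methods, and the test point are all hypothetical. Each object carries a value, a gradient and a Hessian, and every operation propagates all three with the product and chain rules.

```python
import math

class Jet:
    """A value with its gradient and Hessian with respect to n parameters."""
    def __init__(self, val, grad, hess):
        self.val, self.grad, self.hess = val, grad, hess

    @staticmethod
    def variable(val, index, n):
        """The index-th independent parameter out of n."""
        grad = [1.0 if k == index else 0.0 for k in range(n)]
        return Jet(val, grad, [[0.0] * n for _ in range(n)])

    def chain(self, f, d1, d2):
        """Apply a scalar function with value f and derivatives d1, d2 at self.val."""
        n = len(self.grad)
        grad = [d1 * g for g in self.grad]
        hess = [[d1 * self.hess[i][j] + d2 * self.grad[i] * self.grad[j]
                 for j in range(n)] for i in range(n)]
        return Jet(f, grad, hess)

    def __add__(self, o):
        n = len(self.grad)
        return Jet(self.val + o.val,
                   [p + q for p, q in zip(self.grad, o.grad)],
                   [[self.hess[i][j] + o.hess[i][j] for j in range(n)]
                    for i in range(n)])

    def __neg__(self):
        return self.chain(-self.val, -1.0, 0.0)

    def __sub__(self, o):
        return self + (-o)

    def __mul__(self, o):
        n = len(self.grad)
        grad = [self.val * q + o.val * p for p, q in zip(self.grad, o.grad)]
        hess = [[self.val * o.hess[i][j] + o.val * self.hess[i][j]
                 + self.grad[i] * o.grad[j] + o.grad[i] * self.grad[j]
                 for j in range(n)] for i in range(n)]
        return Jet(self.val * o.val, grad, hess)

    def __truediv__(self, o):
        # u / v = u * (1/v), with d(1/v) = -1/v^2 and d2(1/v) = 2/v^3.
        inv = o.chain(1.0 / o.val, -1.0 / o.val ** 2, 2.0 / o.val ** 3)
        return self * inv

def sqrt(u):
    r = math.sqrt(u.val)
    return u.chain(r, 0.5 / r, -0.25 / (r * u.val))

def acos(u):
    s = 1.0 - u.val ** 2
    return u.chain(math.acos(u.val), -1.0 / math.sqrt(s),
                   -u.val / (s * math.sqrt(s)))

def angle_jet(x):
    """theta with gradient and Hessian for x = (a, b, c) flattened to 9 numbers."""
    v = [Jet.variable(xi, k, 9) for k, xi in enumerate(x)]
    vab = [v[3 + k] - v[k] for k in range(3)]
    vac = [v[6 + k] - v[k] for k in range(3)]
    dot = lambda p, q: p[0] * q[0] + p[1] * q[1] + p[2] * q[2]
    return acos(dot(vab, vac) / sqrt(dot(vab, vab) * dot(vac, vac)))

# Arbitrary, non-degenerate test point (illustrative only).
x0 = [0.1, -0.2, 0.3, 1.0, 0.5, -0.4, -0.3, 1.2, 0.8]
theta = angle_jet(x0)
print(theta.val)
```

After the call, `theta.grad` holds the 9 first derivatives and `theta.hess` the $9 \times 9$ Hessian, with no hand-derived formula involved. The cost grows with the square of the number of parameters per operation, so for large problems one would only seed the parameters each term actually depends on.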

**Bumbble Comm** · Answer 2 · 2015-08-31 19:51:46

Rather than using your trigonometric expressions, I'd opt to stick with the vector and polynomial calculations throughout. I think the definition $$ g(a,b,c) = (b-a ) \cdot (c-a)/{||b-a|| ||c-a||}$$ along with the identities $$ \nabla_b g(a,b,c) = \frac{(c-a) ||b-a||^2 - \left( (b-a) \cdot (c-a) \right) (b-a)}{||b-a||^3 ||c-a||},$$ and similarly for $\nabla_cg $ , are superior to the trig versions... These have the clear advantage of highlighting the geometric intuition $$ \nabla_b g(a,b,c) \cdot (b-a) = 0.$$
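The identity for $\nabla_b g$ and the orthogonality relation are easy to confirm numerically. This is a small sketch under the definitions in this answer; the function names and the test points are illustrative assumptions.

```python
import math

def g(a, b, c):
    """Cosine of the angle at a: (b-a).(c-a) / (||b-a|| ||c-a||)."""
    vab = [p - q for p, q in zip(b, a)]
    vac = [p - q for p, q in zip(c, a)]
    d = sum(p * q for p, q in zip(vab, vac))
    return d / math.sqrt(sum(p * p for p in vab) * sum(q * q for q in vac))

def grad_b_g(a, b, c):
    """Closed form of the gradient of g with respect to b, as stated above."""
    vab = [p - q for p, q in zip(b, a)]
    vac = [p - q for p, q in zip(c, a)]
    d = sum(p * q for p, q in zip(vab, vac))
    r = math.sqrt(sum(p * p for p in vab))
    s = math.sqrt(sum(q * q for q in vac))
    return [(q * r ** 2 - d * p) / (r ** 3 * s) for p, q in zip(vab, vac)]

def numeric_grad_b_g(a, b, c, h=1e-6):
    """Central-difference gradient of g with respect to b."""
    out = []
    for k in range(3):
        bp, bm = list(b), list(b)
        bp[k] += h
        bm[k] -= h
        out.append((g(a, bp, c) - g(a, bm, c)) / (2 * h))
    return out

# Arbitrary, non-degenerate test points (illustrative only).
a, b, c = [0.1, -0.2, 0.3], [1.0, 0.5, -0.4], [-0.3, 1.2, 0.8]
gb = grad_b_g(a, b, c)
grad_err = max(abs(p - q) for p, q in zip(gb, numeric_grad_b_g(a, b, c)))
# Orthogonality: grad_b g . (b - a) should vanish.
ortho = sum(p * (bi - ai) for p, bi, ai in zip(gb, b, a))
print(grad_err, ortho)
```

Both printed numbers should be close to zero: the first up to finite-difference error, the second up to rounding.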

Also, your definition of $J_i(u_{ij})$ is unclear and the subsequent computations look cluttered. I don't think that $J$ is helping you....

Continuing along....

By the chain rule, since $\theta = \arccos g$, $$ \nabla_b \theta = - \frac{1}{\sqrt{1 - g^2}} \nabla_b g$$ and so, as $$H_{b b}(\theta) = \left( \begin{matrix} \nabla_b [\partial_{b_1} \theta] & \nabla_b [\partial_{b_2} \theta] & \nabla_{b} [\partial_{b_3} \theta] \end{matrix} \right),$$

you should be able to use the identities above to get the complete form for $H_{bb}$. For example, writing $g_{b_1} = \partial_{b_1} g$, $$ \nabla_b [ \partial_{b_1} \theta ] = - \frac{g\, g_{b_1} \nabla_b g + (1 - g^2)\, \nabla_b g_{b_1}}{(1 - g^2)^{3/2}}.$$ All that is left is the second derivatives of $g$, which look like

$$ \nabla_b g_{b_1} = \frac{2 (c_1 - a_1)\,(b-a) - (b_1 - a_1)\,(c-a) - [(b-a) \cdot (c-a)]\,(1,0,0)}{||b-a||^3 ||c-a||} - \frac{3\, g_{b_1}\,(b-a)}{||b-a||^2}$$ where I have used subscripts for vector indices. Not pretty but clearer than before, no?

**A suggestion.** Along an algebraic line of thinking: the vectors $(0,0,0,(b-a),0,0,0)$ and $(0,0,0,0,0,0,(c-a))$ are clearly in the null space of $H$ at $x = (a, b, c)$, so as the null space is closed and $(x,x,x)$ is also in the null space (as the angle is shift-independent), you get that $(a,b,c)$ is in the null space; this is clear if you think about scaling the vectors. There is also some rotation invariance. I'm not sure about the $9$-D rotation, but the $3$-D rotation of each of $a,b,c$ about the same axis yields an invariant transformation for $\theta$. Since you have so many invariants and the rank seems to be at most $6$, perhaps a reparametrization is appropriate for a more elegant computation of the Hessian.

**Bumbble Comm** · Answer 3 · 2015-09-01 22:53:58

The OP was a bit loose with showing their work, so I've expanded the derivations here, more-or-less trusting their numerically-checked derivations along the way:

$$ \begin{align} \theta_{ba} &= \theta_b \cdot \nabla_a^\intercal \\ &= \gamma(b,c)(\cot\theta\, \mathbf{u}_{ab} - \csc\theta\, \mathbf{u}_{ac}) \cdot \nabla_a^\intercal \\ \\ &\mbox{[Note that }\gamma(b,c)\mbox{ is a constant w.r.t. }\nabla_a\mbox{]}\\ \\ \theta_{ba}/\gamma(b,c) &= (\cot\theta\, \mathbf{u}_{ab} - \csc\theta\, \mathbf{u}_{ac}) \cdot \nabla_a^\intercal \\ &= \begin{pmatrix} \cot\theta\, x_{\mathbf{u}_{ab}} \\ \cot\theta\, y_{\mathbf{u}_{ab}} \\ \cot\theta\, z_{\mathbf{u}_{ab}} \\ \end{pmatrix} \cdot \nabla_a^\intercal - \begin{pmatrix} \csc\theta\, x_{\mathbf{u}_{ac}} \\ \csc\theta\, y_{\mathbf{u}_{ac}} \\ \csc\theta\, z_{\mathbf{u}_{ac}} \\ \end{pmatrix} \cdot \nabla_a^\intercal \\ &= \begin{pmatrix} (\cot\theta\,\nabla_a^\intercal) x_{\mathbf{u}_{ab}} + (x_{\mathbf{u}_{ab}} \nabla_a^\intercal)\cot\theta \\ (\cot\theta\,\nabla_a^\intercal) y_{\mathbf{u}_{ab}} + (y_{\mathbf{u}_{ab}} \nabla_a^\intercal)\cot\theta \\ (\cot\theta\,\nabla_a^\intercal) z_{\mathbf{u}_{ab}} + (z_{\mathbf{u}_{ab}} \nabla_a^\intercal)\cot\theta \\ \end{pmatrix} - \begin{pmatrix} (\csc\theta\,\nabla_a^\intercal) x_{\mathbf{u}_{ac}} + (x_{\mathbf{u}_{ac}} \nabla_a^\intercal)\csc\theta \\ (\csc\theta\,\nabla_a^\intercal) y_{\mathbf{u}_{ac}} + (y_{\mathbf{u}_{ac}} \nabla_a^\intercal)\csc\theta \\ (\csc\theta\,\nabla_a^\intercal) z_{\mathbf{u}_{ac}} + (z_{\mathbf{u}_{ac}} \nabla_a^\intercal)\csc\theta \\ \end{pmatrix}\\ &= \begin{pmatrix} (\theta \nabla_a^\intercal)(-\csc^2\theta)\,x_\mathbf{u_{ab}} \\ (\theta \nabla_a^\intercal)(-\csc^2\theta)\,y_\mathbf{u_{ab}} \\ (\theta \nabla_a^\intercal)(-\csc^2\theta)\,z_\mathbf{u_{ab}} \\ \end{pmatrix} + (\mathbf{u}_{ab} \cdot \nabla_a^\intercal)\cot\theta - \begin{pmatrix} (\theta \nabla_a^\intercal)(-\csc\theta\cot\theta)\,x_\mathbf{u_{ac}} \\ (\theta \nabla_a^\intercal)(-\csc\theta\cot\theta)\,y_\mathbf{u_{ac}} \\ (\theta \nabla_a^\intercal)(-\csc\theta\cot\theta)\,z_\mathbf{u_{ac}} \\ \end{pmatrix} - (\mathbf{u}_{ac} \cdot 
\nabla_a^\intercal)\csc\theta \\ &= (-\csc^2\theta)\begin{pmatrix} (\theta \nabla_a^\intercal)\,x_\mathbf{u_{ab}} \\ (\theta \nabla_a^\intercal)\,y_\mathbf{u_{ab}} \\ (\theta \nabla_a^\intercal)\,z_\mathbf{u_{ab}} \\ \end{pmatrix} + (\mathbf{u}_{ab} \cdot \nabla_a^\intercal)\cot\theta - (-\csc\theta\cot\theta)\begin{pmatrix} (\theta \nabla_a^\intercal)\,x_\mathbf{u_{ac}} \\ (\theta \nabla_a^\intercal)\,y_\mathbf{u_{ac}} \\ (\theta \nabla_a^\intercal)\,z_\mathbf{u_{ac}} \\ \end{pmatrix} - (\mathbf{u}_{ac} \cdot \nabla_a^\intercal)\csc\theta \\ &= (\mathbf{u}_{ab} \cdot \nabla_a^\intercal)\cot\theta - (\mathbf{u}_{ab} \cdot \theta_a^\intercal)\csc^2\theta - (\mathbf{u}_{ac} \cdot \nabla_a^\intercal)\csc\theta + (\mathbf{u}_{ac} \cdot \theta_a^\intercal)\csc\theta\cot\theta \\ &= (\mathbf{u}_{ab} \cdot \nabla_a^\intercal)\cot\theta - (\mathbf{u}_{ac} \cdot \nabla_a^\intercal)\csc\theta + \csc\theta(\cot\theta\,\mathbf{u}_{ac} - \csc\theta\,\mathbf{u}_{ab}) \cdot \theta_a^\intercal \end{align} $$

If we assume that the OP's $\mathbf{J}_i(\mathbf{u}_{ij}) = (\mathbf{u}_{ij}\cdot \nabla_i^\intercal)$, where $i$ and $j$ are one of $a$, $b$, or $c$ (I have not verified the OP's Jacobian formula) then this gives us:

$\theta_{ba} = \gamma(b,c)\left( \mathbf{J}_a(\mathbf{u}_{ab})\cot\theta - \mathbf{J}_a(\mathbf{u}_{ac})\csc\theta + \csc\theta(\cot\theta\,\mathbf{u}_{ac} - \csc\theta\,\mathbf{u}_{ab}) \cdot \theta_a^\intercal\right)$

which seems to be identical to the OP's formula. If the expanded calculus above is correct as far as anyone can tell (please comment if you see errors), then the OP's errors are likely due to coding errors or errors in the derivation of the Jacobian formulas.
