Hessian of the norm of a non-linear map

203 Views Asked by Bumbble Comm At 10 May 2026 - 5:16

Suppose $F: \mathbb{R}^n \rightarrow \mathbb{R}^m$ and define the scalar-valued map $\Phi(x;y) = \frac{1}{2}\|y - F(x)\|_2^2 $. I am interested in the Hessian of this map written in terms of the second (and first) derivatives of $F$. Namely, write $DF: \mathbb{R}^n \rightarrow \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$ for the Jacobian of $F$ and $D^2F: \mathbb{R}^n \rightarrow \mathcal{L}(\mathbb{R}^n, \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m))$ for the second derivative. The gradient of $\Phi$, which I will denote $D \Phi: \mathbb{R}^n \rightarrow \mathcal{L}(\mathbb{R}^n, \mathbb{R}) \simeq \mathbb{R}^n$, I can compute as $$D \Phi(x) = DF^T (x)[F(x) - y]$$ but now how do I compute the Hessian $D^2 \Phi: \mathbb{R}^n \rightarrow \mathcal{L}(\mathbb{R}^n, \mathcal{L}(\mathbb{R}^n, \mathbb{R})) \simeq \mathcal{L}(\mathbb{R}^n, \mathbb{R}^n)$ in terms of $DF$ and $D^2F$? I'm assuming it should be something like $$D^2\Phi(x) = DF^T (x) DF(x) + D^2F(x)[\cdot,\cdot]$$ but I'm not sure what goes into $[\cdot,\cdot]$. Thanks.


There is 1 best solution below

Bumbble Comm On 14 May 2018 - 11:50

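It's much easier to do this sort of thing using the Taylor expansion idea, i.e. that $$ \Phi(a+h) = \Phi(a) + D\Phi(a)(h) + \tfrac{1}{2} D^2\Phi(a)(h,h) + o(\lVert h \rVert^2): $$ if you can create an expansion of this form, you can read off $D^2\Phi$. In this case, write $r = y - F(x)$ and use $F(x+h) = F(x) + DF(x)(h) + \tfrac{1}{2} D^2F(x)(h,h) + o(\lVert h \rVert^2)$. Then $$ \Phi(x+h;y+k) = \tfrac{1}{2} \lVert y+k-F(x+h) \rVert_2^2 = \tfrac{1}{2} \lVert r + k - DF(x)(h) - \tfrac{1}{2} D^2F(x)(h,h) + o(\lVert h \rVert^2) \rVert_2^2 \\ = \tfrac{1}{2} \lVert r \rVert_2^2 + r^T \left( k - DF(x)(h) - \tfrac{1}{2} D^2F(x)(h,h) \right) + \tfrac{1}{2} \lVert k - DF(x)(h) \rVert_2^2 + o( \lVert (h;k) \rVert^2 ) \\ = \Phi(x;y) + (y-F(x))^T (k - DF(x)(h)) + \tfrac{1}{2} \left[ \lVert k - DF(x)(h) \rVert_2^2 - (y-F(x))^T D^2F(x)(h,h) \right] + o( \lVert (h;k) \rVert^2 ), $$ from which we read off $$ D\Phi(x;y)(h;k) = (y-F(x))^T (k - DF(x)(h)), \\ D^2\Phi(x;y)((h;k),(h;k)) = \lVert k - DF(x)(h) \rVert_2^2 - (y-F(x))^T D^2F(x)(h,h). $$ Holding $y$ fixed (that is, $k = 0$), the first line becomes $(F(x)-y)^T DF(x)(h)$, which matches your gradient, and the second becomes $\lVert DF(x)(h) \rVert_2^2 + (F(x)-y)^T D^2F(x)(h,h)$: both slots of $D^2F(x)[\cdot,\cdot]$ take $h$, and the resulting vector in $\mathbb{R}^m$ is contracted with the residual $F(x)-y$. While it is possible to simplify this, there are limited benefits to doing so, since these expressions retain the linear maps in forms that keep their arguments in the right places, rather than trying to twist everything to behave like a matrix of some kind.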
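As a sanity check, with $y$ held fixed and $\Phi$ carrying the factor $\frac{1}{2}$ from the question, the Hessian consistent with the gradient $DF^T(x)[F(x)-y]$ can be written in matrix form as $DF(x)^T DF(x) + \sum_i (F_i(x) - y_i)\, D^2F_i(x)$. The following is a minimal sketch in plain Python that compares this expression against a central finite-difference Hessian of $\Phi$. The particular map `F`, the point `x`, the data `y`, and all helper names are made up for illustration only.

```python
import math

# Hypothetical example map F: R^2 -> R^2, chosen only for illustration.
def F(x):
    x0, x1 = x
    return [x0**2 * x1, math.sin(x0) + x1**2]

def jacobian(x):
    # DF(x) as an m-by-n list of lists (row i = gradient of F_i).
    x0, x1 = x
    return [[2 * x0 * x1, x0**2],
            [math.cos(x0), 2 * x1]]

def second_derivatives(x):
    # D^2F(x) as a list of m Hessian matrices, one per component F_i.
    x0, x1 = x
    return [
        [[2 * x1, 2 * x0], [2 * x0, 0.0]],
        [[-math.sin(x0), 0.0], [0.0, 2.0]],
    ]

def phi(x, y):
    # Phi(x; y) = 1/2 * ||y - F(x)||^2
    return 0.5 * sum((yi - fi) ** 2 for yi, fi in zip(y, F(x)))

def hessian_formula(x, y):
    # DF^T DF + sum_i (F_i - y_i) * D^2 F_i, with y held fixed.
    J = jacobian(x)
    H = second_derivatives(x)
    res = [fi - yi for fi, yi in zip(F(x), y)]
    n, m = len(x), len(res)
    return [[sum(J[i][a] * J[i][b] + res[i] * H[i][a][b] for i in range(m))
             for b in range(n)] for a in range(n)]

def hessian_fd(x, y, eps=1e-4):
    # Central finite-difference approximation of the Hessian of phi in x.
    n = len(x)

    def shifted(a, sa, b, sb):
        z = list(x)
        z[a] += sa * eps
        z[b] += sb * eps
        return phi(z, y)

    return [[(shifted(a, 1, b, 1) - shifted(a, 1, b, -1)
              - shifted(a, -1, b, 1) + shifted(a, -1, b, -1)) / (4 * eps**2)
             for b in range(n)] for a in range(n)]

x = [0.7, -1.3]   # arbitrary evaluation point
y = [0.5, 2.0]    # arbitrary data vector

H_exact = hessian_formula(x, y)
H_num = hessian_fd(x, y)
max_err = max(abs(H_exact[a][b] - H_num[a][b])
              for a in range(2) for b in range(2))
print("max abs difference:", max_err)
```

If the two matrices agree up to finite-difference error, that supports the sign of the second-order term: the residual entering the sum is $F(x) - y$, not $y - F(x)$.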
