I have a quadratic form like $$ ( \boldsymbol x - f( \boldsymbol x ) )^T \boldsymbol A ( \boldsymbol x - f( \boldsymbol x ) )$$ with $f(\cdot): \mathbb{R}^n \rightarrow \mathbb{R}^m$.
I would like to compute its Hessian in terms of the Jacobian and higher-order derivatives of $f(\cdot)$.
I already found some hints (Section Basic properties) for the product rule, but couldn't figure out how to apply this further to the Hessian.
Edit:
My current approach: $z(x) = g(x)h(x)$, where $g(x) = (x-f)^T$ and $h(x)=A(x-f)$. The first derivative is then given by $$ D[z] = h^TD[g] + gD[h] $$
For $u(x):\mathbb{R}^n\rightarrow\mathbb{R}^{m\times p}$ and $v(x):\mathbb{R}^n\rightarrow\mathbb{R}^{p\times q}$, the product rule from here (page 4, second-to-last equation) is $$ D[uv] = (v^T\otimes I_m) D[u] + (I_q\otimes u)D[v] $$ where in my case $m=q=1, p=n$.
Applying this to the first derivative above, I get $$ D^2[z] = (D[g])^T D[h^T] + \{ I_n \otimes h^T \} D^2[g] + (D[h])^T D[g] + \{ I_n\otimes g \} D^2[h]\\\\ = \{ I_n- D[f^T] \}^T\{ I_n - D[f] \}^T A^T + \{ I_n \otimes (x-f)^TA^T \} D^2[g] + \{ I_n -D[f] \}^TA^T\{ I_n - D[f^T] \} + \{ I_n \otimes (x-f)^T \} D^2[h] $$
My questions are now basically: Is that correct? And how are $D^2[g]=-D^2[f^T]$ and $D^2[h]=-A\cdot{}D[D[f]^T]$ actually defined? Is the former like $$-\begin{bmatrix} D^2[f_1] \\ \vdots \\ D^2[f_n] \end{bmatrix}$$ i.e. stacked Hessians of the elements of $f(x)$? But what about the latter?
Let $g(x)=(x-f(x))^TA(x-f(x))$. Then
$Dg_x:H\in \mathbb{R}^n\rightarrow (H-Df_xH)^T(A+A^T)(x-f(x))$ is linear and
$D^2g_x:(H,K)\in (\mathbb{R}^n)^2\rightarrow (H-Df_xH)^T(A+A^T)(K-Df_xK)-D^2f_x(H,K)^T(A+A^T)(x-f(x))$
is a symmetric bilinear form.
EDIT 1. $x-f(x)$ is defined only if $m=n$.
Let $f=(f_1,\cdots,f_n)$; then the derivative $Df_x$ is the Jacobian matrix with $(i,j)$ entry $\dfrac{\partial f_i}{\partial x_j}$.
The second derivative is $D^2f_x=[D^2f_{1x},\cdots,D^2f_{nx}]^T$ where $D^2f_{kx}$ is the symmetric matrix (Hessian) with $(i,j)$ entry $\dfrac{\partial^2f_k}{\partial x_i\partial x_j}(x)$ and $D^2f_{kx}(H,K)=H^TD^2f_{kx}K\in\mathbb{R}$.
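To connect this with the matrix Hessian asked for in the question, here is a sketch of the same result in matrix form. The shorthand $J=Df_x$, $S=A+A^T$, $r=x-f(x)$ and $w=Sr$ is introduced here only for readability. Since $D^2f_x(H,K)^TSr=\sum_k w_k\,H^TD^2f_{kx}K$, the bilinear form $D^2g_x(H,K)$ can be written as $H^T(\cdot)K$, which should give $$ \nabla g(x) = (I_n-J)^TS\,r, \qquad \nabla^2 g(x) = (I_n-J)^TS(I_n-J)-\sum_{k=1}^n w_k\,D^2f_{kx}. $$ So the stacked Hessians of the $f_k$ enter only through a weighted sum, with weights given by the entries of $(A+A^T)(x-f(x))$.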
EDIT 2. Answer to bonanza. $H,K$ are vectors; $Df_xH$ is the product of the $n\times n$ matrix $Df_x$ with the vector $H$; $H$ is a variation of $x$, like the $dx$ of the physicists. We see the usefulness of $H$ in the Taylor formula
$g(x+H)=g(x)+Dg_x(H)+(1/2)D^2g_x(H,H)+O(||H||^3)$.
For $n=2$,
$g(x+H)=g(x)+(H-Df_xH)^T(A+A^T)(x-f(x))+$
$\dfrac{1}{2}((H-Df_xH)^T(A+A^T)(H-Df_xH)-[H^TD^2f_{1x}H,H^TD^2f_{2x}H](A+A^T)(x-f(x)))+O(||H||^3)$.
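A quick numerical sanity check of the formulas for $Dg_x$ and $D^2g_x$ in the case $n=2$ can be sketched with NumPy. Everything concrete below is an assumption made for illustration only: the map `f`, the non-symmetric matrix `A`, the test point `x0`, and the helper names. The script builds the gradient and Hessian from $Df_x$ and the Hessians of $f_1, f_2$, then compares them with central finite differences.

```python
import numpy as np

def f(x):
    # Hypothetical smooth map R^2 -> R^2, chosen only for illustration.
    return np.array([np.sin(x[0]) * x[1], x[0] ** 2 + np.cos(x[1])])

def jac_f(x):
    # Jacobian Df_x, entry (i, j) = d f_i / d x_j.
    return np.array([
        [np.cos(x[0]) * x[1], np.sin(x[0])],
        [2.0 * x[0], -np.sin(x[1])],
    ])

def hess_f(x):
    # Stacked Hessians: hess_f(x)[k] is the Hessian of f_k.
    return np.array([
        [[-np.sin(x[0]) * x[1], np.cos(x[0])],
         [np.cos(x[0]), 0.0]],
        [[2.0, 0.0],
         [0.0, -np.cos(x[1])]],
    ])

def g(x, A):
    r = x - f(x)
    return r @ A @ r

def grad_g(x, A):
    # Dg_x(H) = (H - Df_x H)^T (A + A^T) (x - f(x)), written as a vector.
    M = np.eye(len(x)) - jac_f(x)
    return M.T @ (A + A.T) @ (x - f(x))

def hess_g(x, A):
    # D^2 g_x(H, K) read as H^T (...) K:
    # (I - J)^T S (I - J) - sum_k w_k * Hess f_k, with w = S (x - f(x)).
    S = A + A.T
    M = np.eye(len(x)) - jac_f(x)
    w = S @ (x - f(x))
    return M.T @ S @ M - np.tensordot(w, hess_f(x), axes=1)

def fd_gradient(x, A, eps=1e-6):
    # Central differences of g.
    n = len(x)
    out = np.zeros(n)
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        out[j] = (g(x + e, A) - g(x - e, A)) / (2.0 * eps)
    return out

def fd_hessian(x, A, eps=1e-6):
    # Central differences of the analytic gradient, column by column.
    n = len(x)
    out = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        out[:, j] = (grad_g(x + e, A) - grad_g(x - e, A)) / (2.0 * eps)
    return out

A = np.array([[2.0, 1.0], [0.0, 3.0]])  # deliberately non-symmetric
x0 = np.array([0.3, -0.7])

print("max gradient error:", np.abs(grad_g(x0, A) - fd_gradient(x0, A)).max())
print("max Hessian error: ", np.abs(hess_g(x0, A) - fd_hessian(x0, A)).max())
```

Both printed errors should be small compared with the entries themselves if the formulas are right. Using a non-symmetric `A` is a deliberate choice here, since it exercises the $A+A^T$ factor rather than just $2A$.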