I'm working on a problem that involves deriving the Jacobian and Hessian of the following nonlinear least squares function. I've been thinking of expanding the function in a Taylor series and then matching terms to read off the Jacobian and Hessian, but I'm stuck right now.
$f(x) = \sum_{i=1}^m [y_i - (a_i^Tx)^2]^2$
Where $x \in \Bbb R^{n}$, each $a_i \in \Bbb R^{n}$ (so $a_i^T$ is a row vector), and $y_i \in \Bbb R$.
Multiple sources give the derivations in a general form, which doesn't let me write the gradient and Hessian in a nice vector-matrix form. Any help or hint would be much appreciated.
Here's my attempt to do it in a fairly clean way.
Your function $f$ can be written as
$$ f(x) = \sum_{i=1}^m (y_i - h_i(x))^2, $$
where $h_i(x) = (a_i^T x)^2$. By the chain rule,
\begin{align} f'(x) &= \sum_{i=1}^m 2(y_i - h_i(x)) (-h_i'(x)) \\ &= \sum_{i=1}^m 2(y_i - (a_i^T x)^2) (-2 (a_i^T x) a_i^T) \\ &= -4 \sum_{i=1}^m (y_i - (a_i^T x)^2) x^T a_i a_i^T, \end{align}
where the last step uses that $a_i^T x = x^T a_i$ is a scalar. Note that $f'(x)$ is a row vector (the Jacobian of $f$). The gradient of $f$ is the column vector
$$ \nabla f(x) = f'(x)^T = -4 \sum_{i=1}^m a_i a_i^T x (y_i - (a_i^T x)^2). $$
The Hessian of $f$ is the derivative of the function $G(x) = \nabla f(x)$. One way to compute the Hessian of $f$ is to notice that
\begin{align} G(x) = \nabla f(x) &= -4 \sum_{i=1}^m a_i g_i(a_i^T x), \end{align} where $g_i(u) = u(y_i - u^2) = u y_i - u^3$. The derivative of $g_i$ is $g_i'(u) = y_i - 3u^2$. So we have \begin{align} Hf(x) = G'(x) &= -4 \sum_{i=1}^m a_i g_i'(a_i^T x) a_i^T \\ &= -4 \sum_{i=1}^m a_i (y_i - 3 (a_i^T x)^2) a_i^T. \end{align}
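If you want the vector-matrix form and a sanity check, here is a minimal sketch. It assumes you stack the rows $a_i^T$ into a matrix $A \in \Bbb R^{m \times n}$ and set $u = Ax$ (the names `A`, `u`, and all function names below are my own notation, not from the question). With that, the sums above should collapse to $\nabla f(x) = -4 A^T \big(u \odot (y - u \odot u)\big)$ and $Hf(x) = -4 A^T \operatorname{diag}(y - 3\, u \odot u)\, A$, where $\odot$ is the elementwise product. The code compares both against central finite differences on random data.

```python
import numpy as np

def f(A, y, x):
    """f(x) = sum_i (y_i - (a_i^T x)^2)^2, with a_i^T the rows of A."""
    u = A @ x
    return np.sum((y - u**2) ** 2)

def grad_f(A, y, x):
    """Gradient: -4 * sum_i a_i (a_i^T x) (y_i - (a_i^T x)^2)."""
    u = A @ x
    return -4.0 * A.T @ (u * (y - u**2))

def hess_f(A, y, x):
    """Hessian: -4 * sum_i (y_i - 3 (a_i^T x)^2) a_i a_i^T."""
    u = A @ x
    w = y - 3.0 * u**2
    # (A.T * w) scales column i of A.T (i.e. a_i) by w_i, so this is A^T diag(w) A.
    return -4.0 * (A.T * w) @ A

def central_diff(fun, x, eps=1e-6):
    """Central finite differences of fun at x; column k is d fun / d x_k."""
    cols = []
    for k in range(x.size):
        e = np.zeros_like(x)
        e[k] = eps
        cols.append((fun(x + e) - fun(x - e)) / (2.0 * eps))
    return np.stack(cols, axis=-1)

# Random test problem (sizes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
m, n = 7, 4
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
x = rng.standard_normal(n)

g_exact = grad_f(A, y, x)
H_exact = hess_f(A, y, x)
g_fd = central_diff(lambda z: f(A, y, z), x)       # shape (n,)
H_fd = central_diff(lambda z: grad_f(A, y, z), x)  # shape (n, n)

print("max gradient error:", np.max(np.abs(g_exact - g_fd)))
print("max Hessian error: ", np.max(np.abs(H_exact - H_fd)))
```

Both printed errors should be small compared with the size of the entries if the formulas are right. Note that the Hessian is symmetric but need not be positive semidefinite, since the weights $y_i - 3(a_i^T x)^2$ can have either sign.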