How does the damped Gauss-Newton-Method work?

515 Views Asked by At

I am not sure how I could apply a line search strategy to the Gauss-Newton-Method to solve nonlinear least square problems.

In a least square problem we have a function $f(t, x_1, x_2, ..., x_n)$ and a dataset

\begin{array}{|l|c|r|} \hline & t & y\\ \hline 1 & t_1 & y_1\\ \hline 2 & t_2 & y_2\\ \hline \vdots & \vdots & \vdots\\ \hline \end{array}

where $y$ is the observed output of $f$ for $t$.

For G-N to work, we need a start vector $x_0 = \begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}$

We define the residuum as:

$r = \begin{bmatrix}f(t_1, x) - y_1\\ f(t_2, x) - y_2\\ \vdots\end{bmatrix}$

Furthermore, we define Jacobi as

$J = \begin{bmatrix}\frac{df}{dx_1}(t_1, x) & \frac{df}{dx_2}(t_1, x) & \ldots\\ \frac{df}{dx_1}(t_2, x) & \frac{df}{dx_2}(t_2, x) & \ldots\\ \vdots & \vdots & \vdots \end{bmatrix}$

In each iteration of G-N, we can calculate the direction vector like this:

$J(x)^T \cdot J(x) \cdot d = J(x)^T\cdot r(x)$

For a line-search approach such as Wolfe-Powell, we need:

$\varphi(\alpha) = f(x + \alpha d)$ and $\varphi'(\alpha) = \nabla f(x + \alpha d)^T\cdot d$

In G-N, $\nabla f(x)$ is defined as $J(x)^T\cdot r(x)$. With this definition, we are able to use $\varphi'$ but I don't know how I can use $\varphi$? Which element $t$ of the dataset should I pass in?

I am looking forward to a explanation because pretty much all literature I found on that topic doesn't go into much detail on how this works.

In case I made mistakes in the definition of a least-squares problem, please help me out with that too.