Why doesn't the Gauss-Newton method diverge around the minimum?

I am having trouble visualizing the convergence of the Gauss-Newton method. Consider the simple function $f(x) = x^2 + 1$. If I try to use the Gauss-Newton method to find the minimum of $f(x)^2$ (which is clearly at $x = 0$), and say the current guess is $\hat{x} = 0.1$, my understanding is that the Gauss-Newton method would replace $f(x)$ with its linearization $\hat{f}(x) = f(\hat{x}) + f'(\hat{x})(x - \hat{x}) = 0.2x + 0.99$, and find $\arg\min_x\hat{f}(x)^2$ instead, which is at $x = -4.95$. This new estimate is clearly worse than the original $x = 0.1$. How can Gauss-Newton converge on simple functions like this one if it seems to diverge every time it gets close to the minimum?
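To make the worry concrete, the single-residual step described above can be checked numerically. A minimal sketch (the function name is mine, not from the post):

```python
# Gauss-Newton with the single residual r(x) = x^2 + 1:
# minimize the squared linearized residual, i.e. x_{k+1} = x_k - r(x_k) / r'(x_k).
def gn_single_residual_step(x):
    r = x**2 + 1           # residual value at the current guess
    dr = 2 * x             # residual derivative
    return x - r / dr      # minimizer of the linearized residual squared

x = 0.1
print(gn_single_residual_step(x))  # jumps far from the true minimizer x = 0
```

With a single residual this step is just Newton's method for the root-finding problem $r(x)=0$, which here has no real solution, so the jump away from $x=0$ is exactly the behavior the question observes.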

The Gauss-Newton method is just Newton's method applied to functions of the form $\frac{1}{2}\sum_{i=1}^n r_i^2(x)$, where the Hessian is approximated using first derivatives only. Taking the objective $f(x)=(1+x^2)^2$ (the squared function from the question), we can write \begin{align} f(x)=(1+x^2)^2=x^4+2x^2+1=\frac{1}{2}\left[(\sqrt{2}x^2)^2+(2x)^2+(\sqrt{2})^2\right]. \end{align} The Gauss-Newton method then gives \begin{align} \mathbf{r}&=[\sqrt{2}x^2,~2x,~\sqrt{2}],\\ \nabla\mathbf{r}&=[2\sqrt{2}x,~2,~0],\\ (\nabla f)(x)&=\sum_{i=1}^3r_i\nabla r_i = 4x(x^2+1),\\ (\nabla^2 f)(x)&\approx \sum_{i=1}^3(\nabla r_i)^2 = 4(2x^2+1)~~~~(\text{G.-N.~approximation}), \end{align} therefore \begin{align} x_{k+1}=x_k-\frac{(\nabla f)(x_k)}{(\nabla^2 f)(x_k)}=x_k-\frac{x_k(x_k^2+1)}{2x_k^2+1}=\frac{x_k^3}{2x_k^2+1}, \end{align} which converges rapidly to $0$: \begin{align} \text{iter} = 0, x &= 0.1\\ \text{iter} = 1, x &= 0.0009803921568627453\\ \text{iter} = 2, x &= 9.423205230889076\text{E}-10\\ \text{iter} = 3, x &= 8.367504403129781\text{E}-28\\ \end{align}