Finding the minimum of Rosenbrock's function numerically


As the introduction for a computing assignment on the Nelder Mead method, I need to do the following:

Find the minimum of Rosenbrock's function numerically.

I'm using the standard variant with $a=1$, $b=100$: $F(x_1, x_2) = (1-x_1)^2+100(x_2-x_1^2)^2$. There is a near-identical question with a great answer here; however, that answer doesn't go into much detail about the method used. It applies Newton-Raphson iteration with the update $x_{n+1}=x_n-[J_n]^{-1}\nabla f_n$, where $J_n$ is the Jacobian of $\nabla f_n$ (that is, the Hessian of $f$), in order to find a minimum.

My question is whether the Newton-Raphson method is the best way of doing this and, if so, how I would go about finding the inverse of $J_n$, as doing this myself gave an expression that seems far too complex to be plausible. I'm also confused about how to plug the matrix $J_n$ and the vector $\nabla f_n$ into the update in order to actually produce successive guesses for the minimum.

EDIT: I should have clarified: I'm asking how to carry out the method numerically by hand, since this part of the assignment requires me to solve for the minimum WITHOUT using a computer.



In general, it is faster and more numerically stable to solve the linear system $Hs = -\nabla f_n$ and then update the iterate via $x_{n+1} = x_n+s$, rather than forming $[J_n]^{-1}$ explicitly. You can use whatever linear solver your software environment provides. The steps should look something like this:

    res = norm(df(x_c))        % initialize with the gradient norm
    while res > tol
        H = d2f(x_c)           % Hessian at the current iterate
        g = df(x_c)            % gradient at the current iterate
        s = linsolve(H, -g)    % Newton step: solve H*s = -g

        x_c = x_c + s
        res = norm(df(x_c))    % measure the gradient, not f itself
    end
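For the specific $F$ in the question, the ingredients this loop needs can be written out by hand (my own derivation, worth re-checking, which also covers the by-hand part of the assignment):

$$\nabla F = \begin{pmatrix} -2(1-x_1) - 400x_1(x_2 - x_1^2) \\ 200(x_2 - x_1^2) \end{pmatrix}, \qquad H = \begin{pmatrix} 2 - 400x_2 + 1200x_1^2 & -400x_1 \\ -400x_1 & 200 \end{pmatrix}.$$

Setting $\nabla F = 0$, the second component gives $x_2 = x_1^2$, and substituting into the first gives $x_1 = 1$. So the only stationary point is $(1,1)$, where $F(1,1)=0$.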

Here d2f is a function that returns the $2\times2$ Hessian matrix and df is a function that returns the gradient of $f$. Newton's method may not be a good choice for this problem, since the Hessian is not positive-definite everywhere; this means Newton's method may send the iterates away from the minimizer if the initial guess is bad. Quasi-Newton methods such as BFGS would be good for this problem, but they are fairly difficult to write yourself. Gradient descent is a simple method that is relatively robust as well, so that could be another option.
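If you do end up checking your by-hand work on a computer, here is a minimal NumPy sketch of the loop above (the function names `f`, `df`, `d2f`, `newton`, the tolerance, and the starting point are my own choices, and the derivatives should be re-derived rather than trusted):

```python
import numpy as np

def f(x):
    # Rosenbrock's function with a = 1, b = 100
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def df(x):
    # gradient of f
    return np.array([
        -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
        200 * (x[1] - x[0]**2),
    ])

def d2f(x):
    # Hessian of f
    return np.array([
        [2 - 400 * x[1] + 1200 * x[0]**2, -400 * x[0]],
        [-400 * x[0], 200.0],
    ])

def newton(x0, tol=1e-10, max_iter=100):
    x_c = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = df(x_c)
        if np.linalg.norm(g) < tol:        # stop when the gradient is (near) zero
            break
        s = np.linalg.solve(d2f(x_c), -g)  # Newton step: solve H s = -g
        x_c = x_c + s
    return x_c

x_min = newton([1.2, 1.2])
```

Note that `np.linalg.solve` plays the role of `linsolve`: the Hessian is never inverted explicitly. From a starting point in the valley near the minimizer, such as $(1.2, 1.2)$, the iteration converges to $(1,1)$ in a handful of steps.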