Disadvantage of Newton method in optimization compared with gradient descent


The Wikipedia article on Newton's method in optimization contains a diagram showing Newton's method converging much faster than gradient descent.

What is the disadvantage of Newton's method compared with gradient descent?




BEST ANSWER

Basically, Newton's method works best when applied to find a non-repeated root of a differentiable function, because there it guarantees quadratic convergence. To use it for finding local minima, you apply it to the first derivative of the objective function, which means it can only be used when the objective is twice differentiable. You also need to check that the desired root of the first derivative is not repeated; otherwise the convergence is only linear.
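A minimal sketch of this in one dimension (the helper `newton_minimize` is just an illustration, not a standard library routine): Newton's method for minimization iterates $x \leftarrow x - f'(x)/f''(x)$, and the repeated-root case shows the drop to linear convergence.

```python
def newton_minimize(df, d2f, x, iters=10):
    """Newton's method for minimization: find a root of the first
    derivative df by iterating x <- x - df(x) / d2f(x)."""
    for _ in range(iters):
        x = x - df(x) / d2f(x)
    return x

# f(x) = (x - 2)^2 + 1: f'(x) = 2(x - 2) has a simple (non-repeated) root
# at x = 2, so convergence is quadratic -- here a single step suffices,
# since f' is linear.
x_simple = newton_minimize(lambda x: 2 * (x - 2), lambda x: 2.0,
                           x=5.0, iters=1)

# f(x) = x^4: f'(x) = 4x^3 has a triple (repeated) root at 0. Each step
# gives x <- x - x/3 = (2/3) x, so the error only shrinks by a constant
# factor per iteration -- linear convergence.
x_repeated = newton_minimize(lambda x: 4 * x**3, lambda x: 12 * x**2,
                             x=1.0, iters=10)
```

After 10 iterations the repeated-root case has only reached about $(2/3)^{10} \approx 0.017$, while the simple-root case is exact after one step.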

In contrast, gradient descent finds local minima of any differentiable function; it does not need second derivatives. But as a result of requiring less structure than Newton's method, its rate of convergence is worse (linear rather than quadratic). It also has parameters, most notably the step size, that usually need to be tuned by hand.

ANSWER

The main disadvantage is that each iteration of Newton's method requires forming the Hessian and solving a large linear system of equations, which for large-scale problems can be prohibitively expensive.
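To make the cost concrete, here is a sketch of one multivariate Newton step (the `newton_step` helper is hypothetical): each iteration solves the $n \times n$ system $H p = -g$ for the step $p$, which costs $O(n^3)$ with a dense factorization, on top of the $O(n^2)$ storage for the Hessian itself.

```python
import numpy as np

def newton_step(grad, hess):
    # The expensive part: solve the n x n linear system H p = -g.
    # A dense factorization costs O(n^3) time and O(n^2) memory,
    # which is the bottleneck for large-scale problems.
    return np.linalg.solve(hess, -grad)

# Quadratic objective f(x) = 0.5 x^T A x - b^T x, with gradient A x - b
# and constant Hessian A (symmetric positive definite here).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x = np.zeros(2)
x = x + newton_step(A @ x - b, A)  # one step lands on the minimizer A^{-1} b
```

On a quadratic, a single Newton step reaches the exact minimizer, which is precisely why the method is so fast when the linear solve is affordable; gradient descent avoids the solve entirely at the cost of many more, cheaper iterations.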