Basically, Newton's method works best when applied to finding simple (non-repeated) roots of a differentiable function, because near such a root it converges quadratically. To use Newton's method to find local minima, you apply it to the first derivative of the objective function, which of course means it can only be used when the objective is twice differentiable. You also need to check that the desired root of the first derivative is not repeated; otherwise convergence degrades to linear.
In contrast, gradient descent finds local minima of any differentiable function. The function does not need to be twice differentiable, but because gradient descent requires less structure than Newton's method, it does not achieve as good a rate of convergence. It also has parameters, such as the step size, that usually need to be tuned by hand.
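To make the comparison concrete, here is a minimal sketch (the objective, step size, and tolerances are illustrative choices, not from the answer above): Newton's method is applied to f'(x), so it needs f''(x) as well, while gradient descent uses only f'(x) plus a hand-tuned step size. For f(x) = x⁴ − 3x³ + 2 the minimizer x = 9/4 is a simple root of f', so Newton converges in a handful of iterations while gradient descent takes many more.

```python
# f(x) = x**4 - 3*x**3 + 2 has a local minimum at x = 9/4,
# where f'(x) = 4x^3 - 9x^2 has a simple (non-repeated) root.

def fprime(x):       # first derivative: all gradient descent needs
    return 4 * x**3 - 9 * x**2

def fsecond(x):      # second derivative: additionally required by Newton
    return 12 * x**2 - 18 * x

def newton(x, tol=1e-10, max_iter=100):
    """Newton's method applied to the root-finding problem f'(x) = 0."""
    for i in range(max_iter):
        step = fprime(x) / fsecond(x)
        x -= step
        if abs(step) < tol:
            return x, i + 1
    return x, max_iter

def gradient_descent(x, lr=0.01, tol=1e-10, max_iter=10_000):
    """Plain gradient descent; lr is a manually tuned step size."""
    for i in range(max_iter):
        step = lr * fprime(x)
        x -= step
        if abs(step) < tol:
            return x, i + 1
    return x, max_iter

x_n, iters_n = newton(3.0)
x_g, iters_g = gradient_descent(3.0)
print(f"Newton:           x = {x_n:.10f} in {iters_n} iterations")
print(f"Gradient descent: x = {x_g:.10f} in {iters_g} iterations")
```

Both runs reach the same minimizer, but Newton does so in far fewer iterations; on the other hand, Newton would need `fsecond` to exist and be nonzero, while gradient descent only needed `fprime` and a workable `lr`.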
The main disadvantage is that each iteration of Newton's method requires solving a large linear system of equations (in the Hessian), which for large-scale problems can be prohibitively expensive.
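A short sketch of the per-iteration cost being described (the quadratic test objective here is an illustrative choice): in n dimensions, each Newton step solves the linear system H p = −g for the Hessian H and gradient g, which costs O(n³) with a dense direct solver, whereas a gradient-descent step is only O(n).

```python
import numpy as np

def newton_step(grad, hess, x):
    """One multivariate Newton step: solve H p = -g, then move to x + p."""
    g = grad(x)
    H = hess(x)
    p = np.linalg.solve(H, -g)   # the O(n^3) dense linear solve per iteration
    return x + p

# Illustration on a quadratic f(x) = 0.5 x^T A x - b^T x, whose minimizer
# solves A x = b; for a quadratic, a single Newton step lands exactly on it.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])

grad = lambda x: A @ x - b   # gradient of the quadratic
hess = lambda x: A           # Hessian is constant here

x = newton_step(grad, hess, np.zeros(2))
print("minimizer:", x)
```

For ill-structured or very large H, practical implementations replace the dense solve with sparse or iterative methods (e.g. conjugate gradient), or avoid forming H altogether, which is exactly why quasi-Newton and first-order methods dominate at scale.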