Disadvantages of Newton's method in optimization compared with gradient descent


The Wikipedia article on Newton's method in optimization includes a diagram suggesting that Newton's method converges much faster than gradient descent.

What is the disadvantage of Newton's method compared with gradient descent?



2 Answers


Basically, Newton's method works best when applied to find non-repeated roots of a differentiable function, because there it guarantees quadratic convergence. To find local minima, you apply it to the first derivative of the objective function, which means it can only be used when the objective function is twice differentiable. You also need to check that the desired root of the first derivative is not repeated; otherwise convergence is only linear.
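A minimal one-dimensional sketch of both cases (the function choices `x**2 + x**4` and `x**4` are my own illustration, not from the question): the first has a simple root of the derivative at 0, so Newton's iteration converges quadratically; the second has a triple root, and the iteration collapses to `x <- (2/3)x`, which is only linear.

```python
def newton_min(fprime, fsecond, x0, iters=20):
    """Newton's method for minimization: iterate x <- x - f'(x)/f''(x)."""
    x = x0
    for _ in range(iters):
        x -= fprime(x) / fsecond(x)
    return x

# f(x) = x^2 + x^4: f'(x) = 2x + 4x^3 has a simple root at 0 -> quadratic convergence
x_quad = newton_min(lambda x: 2*x + 4*x**3, lambda x: 2 + 12*x**2, 1.0)

# f(x) = x^4: f'(x) = 4x^3 has a triple (repeated) root at 0 -> only linear convergence
x_lin = newton_min(lambda x: 4*x**3, lambda x: 12*x**2, 1.0)
```

After 20 iterations, `x_quad` is at machine precision while `x_lin` has only shrunk by a factor of (2/3)^20, about 3e-4 — the repeated root really does cost you the quadratic rate.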

In contrast, gradient descent finds local minima of any differentiable function. It does not need the function to be twice differentiable, but because it requires less structure than Newton's method, it does not achieve as good a rate of convergence. It also has parameters, such as the step size, that usually need to be tuned manually.
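A sketch of the step-size sensitivity, using the same illustrative function `f(x) = x^2 + x^4` as an assumed example: gradient descent only needs the first derivative, but a reasonable learning rate converges linearly while a too-large one diverges.

```python
def gradient_descent(fprime, x0, lr, iters):
    """Plain gradient descent: iterate x <- x - lr * f'(x)."""
    x = x0
    for _ in range(iters):
        x -= lr * fprime(x)
    return x

fp = lambda x: 2*x + 4*x**3   # derivative of f(x) = x^2 + x^4

x_good = gradient_descent(fp, 1.0, lr=0.1, iters=200)  # converges, but only linearly
x_bad = gradient_descent(fp, 1.0, lr=1.0, iters=3)     # overshoots and blows up
```

With `lr=0.1` the iterate approaches 0, but each step only shrinks the error by a roughly constant factor; with `lr=1.0` the iterates grow without bound. Newton's method avoids this manual tuning because the second derivative sets the step scale automatically.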


The main disadvantage is that each iteration of Newton's method requires solving a large linear system of equations, which for large scale problems can be prohibitively expensive.
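To make the cost concrete, here is a sketch of a single Newton step in n dimensions (the quadratic test function and NumPy setup are my own illustration): the step p solves the linear system H p = -g, which with a dense Hessian costs O(n^3) per iteration, versus O(n) for a gradient-descent update.

```python
import numpy as np

def newton_step(grad, hess):
    # Solving this n-by-n system is the expensive part: O(n^3) with dense factorization.
    return np.linalg.solve(hess, -grad)

rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)        # symmetric positive-definite Hessian
b = rng.standard_normal(n)

# Minimize f(x) = 0.5 x^T A x - b^T x, whose gradient is Ax - b and Hessian is A.
x = np.zeros(n)
g = A @ x - b
x = x + newton_step(g, A)          # on a quadratic, one Newton step hits the exact minimizer
```

For large n, forming and factorizing the Hessian is often infeasible, which is why large-scale practice leans on gradient-based or quasi-Newton methods (e.g. L-BFGS) that avoid the full linear solve.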

