To what type of objective/cost functions can I apply Adagrad, Adam etc.?


What range of functions can be minimised using algorithms like Adagrad, Adam, gradient descent, Nesterov momentum, etc.? I know these are used for training neural networks, fitting linear regression, and so on, but what is the range of functions they cover? Is it only convex functions? Monotonic ones?

1 Answer

Accepted answer:

These are "first-order" methods: they make use of the gradient of the objective function, so you need a differentiable objective to start with.

You can try to use methods in the general family of gradient descent methods to minimize a smooth (at least differentiable) non-convex function, but these methods won't ensure convergence to a global minimizer of the function. If the function is also convex, then any local minimizer is a global minimizer and you're all set.
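As a minimal sketch of this point (the test function, step size, and names here are my own choices for illustration, not from the answer), plain gradient descent on a smooth non-convex function lands in whichever local minimum basin it starts in:

```python
def grad_descent(grad, x0, lr=0.01, steps=5000):
    """Plain gradient descent: repeatedly apply x <- x - lr * grad(x)."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Smooth but non-convex: f(x) = (x^2 - 1)^2 + 0.1*x has two local minima,
# near x = +1 and x = -1; only the one near -1 is the global minimizer.
f = lambda x: (x**2 - 1)**2 + 0.1 * x
df = lambda x: 4 * x * (x**2 - 1) + 0.1  # derivative of f

x_right = grad_descent(df, x0=1.5)   # starts in the right basin
x_left = grad_descent(df, x0=-1.5)   # starts in the left basin

# Both are stationary points of f, but f(x_left) < f(x_right):
# the run started at 1.5 converged to a merely local minimizer.
```

The same behaviour carries over to Adam, Adagrad, and momentum variants: they change how the step is scaled, not the fact that they follow local gradient information.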

In some cases, you can establish convergence of these methods to a local minimizer of a smooth non-convex optimization problem. In the literature on nonlinear programming, "global convergence" results are typically "converges to a local minimizer from any starting point" results.

There are also related methods (based on subgradients) which can be used to find a global minimizer of a convex but non-smooth objective function.
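For instance (a minimal sketch, with the objective and step-size schedule chosen purely for illustration), the subgradient method on the non-smooth convex function f(x) = |x| uses sign(x) as a subgradient and a diminishing step size:

```python
def subgradient_method(subgrad, f, x0, steps=2000):
    """Subgradient method with diminishing steps a_k = 1/(k+1).

    Unlike gradient descent on a smooth function, this is not a
    descent method, so we track the best point seen so far.
    """
    x, best_x = x0, x0
    for k in range(steps):
        x -= subgrad(x) / (k + 1)
        if f(x) < f(best_x):
            best_x = x
    return best_x

f = lambda x: abs(x)
# sign(x) is a valid subgradient of |x|; at x = 0 any value in [-1, 1] works.
g = lambda x: 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)

best = subgradient_method(g, f, x0=5.0)
# best approaches the global minimizer x* = 0
```

Because f is convex, the best iterate provably approaches the global minimum as the (non-summable, diminishing) steps accumulate, even though f is not differentiable at the solution.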