I am just learning about optimization and am having trouble understanding the idea behind differentiating cost functions.
I have read that for standard optimization problems, the cost function needs to be differentiable. But I'm not sure which of the following this actually means:
1. The function is in a form that can be differentiated analytically, such that the derivative of the function is another function that can be written out by hand, e.g. $f(x) = x^2 + 3$ becomes $f'(x) = 2x$. However, in this case, if we want to find the minimum of this function, can we not just set the derivative to $0$ and solve for $x$, rather than having to follow the local gradient as in gradient descent?
2. The function cannot be differentiated as above, but $f(x)$ can be computed for any value of $x$. In this way, an estimate of the derivative can be found using the finite difference method, and then gradient descent can be used to keep following the gradient in the desired direction.
Thanks!
Answer to 1) This would work fine for a polynomial of degree 2. However, what if the polynomial is of high degree, say 15? The derivative is then a degree-14 polynomial, and there is no general closed-form solution for the roots of polynomials of degree 5 or higher, so you cannot simply solve $f'(x) = 0$ by hand. That is when methods like gradient descent, golden-section search, Newton's method, etc. come in handy.
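To make the contrast concrete, here is a minimal gradient-descent sketch in Python (the function, step size, and iteration count are illustrative choices, not part of your question). For $f(x) = x^2 + 3$ you could indeed solve $f'(x) = 2x = 0$ directly, but the same loop works unchanged for functions where that algebra is intractable:

```python
def gradient_descent(df, x0, lr=0.1, steps=1000):
    """Follow the negative gradient of a 1-D function starting from x0."""
    x = x0
    for _ in range(steps):
        x -= lr * df(x)  # step downhill along the local gradient
    return x

# Analytic derivative of f(x) = x^2 + 3 (the example from the question)
df = lambda x: 2 * x

x_min = gradient_descent(df, x0=5.0)
print(x_min)  # converges toward 0.0, the true minimizer
```

Note that the loop only ever *evaluates* the derivative at the current point; it never needs to solve $f'(x) = 0$ symbolically, which is why it scales to high-degree polynomials and beyond.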
2) I am not sure, but I think you are referring to something like the secant method; maybe this would help: https://en.wikipedia.org/wiki/Secant_method
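Your description in 2) also matches plain finite-difference gradient descent: estimate the slope from nearby function evaluations, then step downhill. A minimal sketch under that reading (the test function and hyperparameters are illustrative):

```python
def finite_diff(f, x, h=1e-6):
    """Central-difference estimate of f'(x), using only function values."""
    return (f(x + h) - f(x - h)) / (2 * h)

def gd_numeric(f, x0, lr=0.1, steps=500):
    """Gradient descent driven by the finite-difference estimate."""
    x = x0
    for _ in range(steps):
        x -= lr * finite_diff(f, x)
    return x

f = lambda x: (x - 2) ** 2 + 3  # minimum at x = 2
print(gd_numeric(f, x0=-5.0))   # converges toward 2.0
```

This is exactly the situation you describe: $f$ is treated as a black box that can only be evaluated, and the derivative is never written down analytically.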