What does it actually mean if a cost function is differentiable?


I am just learning about optimization, and having trouble understanding the idea behind differentiating cost functions.

I have read that for standard optimization problems, the cost function needs to be differentiable. But I'm not sure which of the following this actually means:

  1. The function is in a form that can be differentiated analytically, such that the derivative of the function is another function that can be written out by hand. E.g. $f(x) = x^2 + 3$ becomes $f'(x) = 2x$. However, in this case, if we want to find the minimum of this function, can we not just set the derivative to $0$ and solve for the corresponding value of $x$, rather than having to follow the local gradient as in gradient descent?

  2. The function cannot be differentiated as above, but $f(x)$ can be computed for any value of $x$. In this way, an estimate of the derivative can be found by using the finite difference method, and then gradient descent can be used to keep following the gradient in the desired direction.
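As a rough illustration of option 2, here is a minimal Python sketch that estimates the derivative with a central finite difference and feeds it to gradient descent. It reuses $f(x) = x^2 + 3$ from option 1; the step size `h`, the `learning_rate`, and the number of steps are assumptions chosen for this toy example, not recommended values.

```python
def f(x):
    # Example cost function from option 1: f(x) = x^2 + 3
    return x**2 + 3


def finite_difference(func, x, h=1e-6):
    # Central difference estimate of func'(x); only function values are used.
    # h is an assumed step size for illustration.
    return (func(x + h) - func(x - h)) / (2 * h)


def gradient_descent(func, x0, learning_rate=0.1, steps=200):
    # Repeatedly step against the estimated slope.
    x = x0
    for _ in range(steps):
        x -= learning_rate * finite_difference(func, x)
    return x


x_min = gradient_descent(f, x0=5.0)
print(x_min, f(x_min))  # x_min should be close to 0, f(x_min) close to 3
```

Note that the code never uses the formula $f'(x) = 2x$; it only needs to be able to evaluate $f$.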

Thanks!


There are 4 best solutions below

0
On

Answer to 1) This would work fine for a polynomial of degree 2. However, what if the polynomial is of a high degree, like 15? Try solving the equation you get after differentiating that. That is when methods like gradient descent, golden-section search, Newton's method, etc. come in handy.

2) I am not sure, but I think you are referring to the secant method. Maybe this would help: https://en.wikipedia.org/wiki/Secant_method
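To make point 1) concrete, here is a small sketch using Newton's method on a polynomial whose derivative has no obvious closed-form root. The polynomial $f(x) = x^6 + x^2 - 4x$, the starting point, and the tolerance are all made up for illustration; setting $f'(x) = 6x^5 + 2x - 4$ to zero by hand is not practical, but iterating is easy.

```python
def f(x):
    # Hypothetical cost function chosen for illustration
    return x**6 + x**2 - 4 * x


def f_prime(x):
    # First derivative: a quintic, awkward to solve by hand
    return 6 * x**5 + 2 * x - 4


def f_second(x):
    # Second derivative: always positive, so f is convex with one minimum
    return 30 * x**4 + 2


def newton_minimize(x0, tol=1e-12, max_iter=50):
    # Newton's method applied to f'(x) = 0
    x = x0
    for _ in range(max_iter):
        step = f_prime(x) / f_second(x)
        x -= step
        if abs(step) < tol:
            break
    return x


x_min = newton_minimize(x0=1.0)
print(x_min, f_prime(x_min))  # f_prime(x_min) should be very close to 0
```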

2
On

I think it is neither. That the objective function is differentiable means that it has a derivative. This allows you to use methods like (2) in your question to approximate it in a numerical scheme, or (1) to work with it analytically. It is a restriction that allows one to use many helpful results about the location of the extrema.

That it is differentiable does not mean you can find out what the derivative is.

BTW, optimization is not limited to differentiable functions all the time :-).
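As an example of that last remark, here is a minimal sketch of golden-section search, which only compares function values and never needs a derivative. The cost $|x - 2|$ and the bracket $[-5, 5]$ are assumptions picked for illustration; the method does assume the function has a single minimum inside the bracket.

```python
import math


def golden_section_search(func, a, b, tol=1e-8):
    # Shrinks the bracket [a, b] around the minimum of a unimodal function,
    # using only function values (no derivatives).
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if func(c) < func(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


def cost(x):
    # Not differentiable at its minimum x = 2
    return abs(x - 2)


x_min = golden_section_search(cost, a=-5.0, b=5.0)
print(x_min)  # should be close to 2
```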

0
On

Statements 1 and 2 describe different methods, both based on the concept of "slope", for finding, for example, the minimum value of your cost function. Both rely on your function being differentiable (at least on a region of interest). Differentiability implies that the function is continuous over that region: no missing function values, no vertical asymptotes, no level jumps, or other pathological features that would stop the function from having a well-behaved value at every point. It also means the graph has no abrupt changes in slope, such as what happens to $y=|x|$ around $x=0$ if $0$ is in the region of interest. The shape of your cost function is important, so you'll want to take note if it exhibits any such pathological behavior.
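The $y=|x|$ case can be checked numerically. The sketch below compares the slope measured from the left and from the right of a point; the helper name `one_sided_slopes` and the step `h` are made up for this illustration. Where the function is differentiable the two estimates agree as `h` shrinks, and at a corner they do not.

```python
def one_sided_slopes(func, x, h=1e-6):
    # Slope estimated from the left and from the right of x
    left = (func(x) - func(x - h)) / h
    right = (func(x + h) - func(x)) / h
    return left, right


# Corner at x = 0: the two slopes disagree (about -1 and +1)
left_abs, right_abs = one_sided_slopes(abs, 0.0)

# Smooth function at x = 0: the two slopes nearly agree (both about 0)
left_sq, right_sq = one_sided_slopes(lambda x: x**2, 0.0)

print(left_abs, right_abs)
print(left_sq, right_sq)
```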

In a practical setting you may know your cost "function" only as a set of data points, in which case the conditions for the existence of a derivative of the underlying, unobserved function are assumed to hold.

0
On

A cost function being differentiable means its graph is smooth, without jumps like those in a histogram or step function, and without sharp corners like $|x|$ has at $x=0$. This is required so that the rate of increase/decrease is defined at all points.