I am trying to figure out why functions that are convex (convex everywhere) are easier to minimize than functions that are NOT convex everywhere.
My main theory was that non-convex functions have multiple extrema (local minima and maxima), and thus the algorithm could get "trapped" in one of those local minima.
However, I then came up with a function that is not convex yet has only one minimum:
Sorry for the horrible drawings! I don't know the equation for this function.
As you can see, this function has regions of convexity and concavity (2nd derivative $> 0$ and $< 0$, respectively), yet only one extremum.
So why would this function I drew be harder to minimize than just a normal, "non-wavy" parabola?
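To make the question concrete, here is a minimal numerical sketch. Since I don't know the equation of my drawn curve, I'm assuming $f(x) = \log(1 + x^2)$ as a stand-in: it is quasiconvex but not convex (it is concave for $|x| > 1$), and it has a single minimum at $x = 0$.

```python
import math

# Hypothetical stand-in for the drawn curve: f(x) = log(1 + x^2).
# Quasiconvex, NOT convex (concave for |x| > 1), single minimum at x = 0.
def f(x):
    return math.log(1.0 + x * x)

def grad(x):
    # f'(x) = 2x / (1 + x^2)
    return 2.0 * x / (1.0 + x * x)

x = 5.0                  # start out in the concave "tail" region
for _ in range(200):
    x -= 0.5 * grad(x)   # plain gradient descent with a fixed step size

print(x)  # ends up at the unique minimum near 0
```

Running this, plain gradient descent still reaches the unique minimum even though the start point is in a concave region, which is exactly what makes me wonder where the extra difficulty of non-convexity is supposed to come from.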
EDIT: A commenter pointed out that my function $f(x)$ is quasiconvex, which suggests my initial theory about multiple extrema being the major challenge may only hold for functions that are neither convex NOR quasiconvex.