Consider a minimization problem $f(x)\to\min$, e.g. a nonlinear least squares problem. This problem may have more than one local solution, i.e. more than one point with $\nabla f(x)=0$.
Apply gradient descent from some initial guess $x_0$ and suppose the algorithm converges to a point $a$, missing a point $b$ located between $x_0$ and $a$ (in the parallelepiped $[x_0,a]$).
When solving the global optimization problem, I would like to exclude the entire parallelepiped $[x_0,a]$ from further search. My question is: how can I characterize the situations in which a local minimizer $b$ with $f(b)<f(a)$ can lie inside the parallelepiped $[x_0,a]$? What conditions forbid such missed local minimizers inside the parallelepiped? I expect that in most cases we may exclude the parallelepiped between the initial guess and the final point (the local minimizer found by gradient descent). Note that the algorithm can be modelled by the gradient flow.
As nice as it would be to exclude that entire chunk of points, this unfortunately cannot be done for a generic nonlinear optimization problem.
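To see concretely why the exclusion fails in general, here is a minimal 1D sketch (the tilted double well and the step size are my illustrative choices, not from the question): gradient descent with a moderately large step overshoots the deeper well and settles in the shallower one, leaving a better minimizer $b$ strictly inside $[x_0,a]$.

```python
# Illustrative counterexample: tilted double well f(x) = (x^2 - 1)^2 / 4 + 0.1 x,
# with minima near x = -1.047 (deeper) and x = +0.946 (shallower).

def f(x):
    return (x**2 - 1)**2 / 4 + 0.1 * x

def grad(x):
    return x**3 - x + 0.1  # derivative of f

x0, step = -2.0, 0.5
x = x0
for _ in range(100):
    x -= step * grad(x)  # the first step jumps from -2.0 to +0.95, over the deep well
a = x  # converges to roughly 0.946

b = -1.047  # approximate left minimizer, strictly between x0 and a
assert x0 < b < a and f(b) < f(a)  # a better minimizer sits inside [x0, a]
```

So the segment from $x_0$ to $a$ contains a strictly better local minimizer that the descent skipped, and any rule excluding $[x_0,a]$ would discard it.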
One condition which will work: if $f$ is convex along the segment $[x_0,a]=\{\lambda x_0+(1-\lambda)a \mid \lambda\in[0,1]\}$ and $a$ is a local minimum, then you can exclude $[x_0,a]$ — a convex function restricted to the segment attains its global minimum over the segment at the endpoint local minimum $a$, so $f\ge f(a)$ there. However, this unfortunately is not saying much. If, for instance, the function is globally convex, then every local minimizer is already a global minimizer. So if we want this partial exclusion via convexity at all, it has to come through such a local notion of convexity.
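The segment-convexity condition can at least be probed numerically. A minimal sketch (the function name, sampling density, and tolerance are my choices): sample $f$ along the segment and test that the discrete second differences are nonnegative.

```python
import numpy as np

def convex_along_segment(f, x0, a, n=200, tol=1e-12):
    """Heuristic check that f is convex on the segment [x0, a],
    via nonnegative second differences of uniformly sampled values."""
    x0, a = np.asarray(x0, float), np.asarray(a, float)
    ts = np.linspace(0.0, 1.0, n)
    vals = np.array([f((1 - t) * x0 + t * a) for t in ts])
    second_diffs = vals[2:] - 2 * vals[1:-1] + vals[:-2]
    return bool(np.all(second_diffs >= -tol))
```

Note this is a sample-based heuristic, not a certificate: a narrow nonconvex dip between sample points would be missed, so in a rigorous setting one would need interval or curvature bounds instead.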
Another condition would be if, for some reason, the algorithm which returns $a$ gives you a guarantee on a radius of optimality $\tau$ (no point within distance $\tau$ of $a$ has a smaller value) — then if $[x_0,a]$ resides inside that radius, we'd be fine.
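If such a $\tau$ is available, checking whether the whole segment lies inside the radius reduces to a single distance computation, since every point of $[x_0,a]$ is within $\|x_0-a\|$ of $a$. A sketch (the guarantee $\tau$ is assumed to be supplied by the algorithm):

```python
import math

def segment_excludable(x0, a, tau):
    """True if the whole segment [x0, a] lies in the ball of radius tau
    around a, i.e. inside the certified radius of optimality."""
    return math.dist(x0, a) <= tau  # farthest segment point from a is x0
```

For example, with $x_0=(0,0)$, $a=(1,0)$ the segment is excludable for $\tau=1.5$ but not for $\tau=0.5$.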
However, these conditions are often hard to verify in practice.