I want to run local optimizers on a multi-dimensional function $f$ with several local minima.
To make sure I find the true global minimum, I first run an initial sampling (e.g. random) over a wide space, obtaining $n$ pairs $(f_i, X_i)$. If I now rank the function values, it is not guaranteed that, say, the lowest and second-lowest $f_i$ belong to the same local minimum, so I want to run some kind of cluster analysis before the local minimizations. However, I wonder which algorithms work efficiently here: would e.g. $k$-means clustering fit, and how would I apply it?
Usually the number of local minima is not known, but we can assume there are only 1 to 3, not more. The functions typically have 2 to 7 dimensions, and the initial random sample has, e.g., $n = 50$ to $500$ points.
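To make the idea concrete, here is a minimal sketch of the clustering step I have in mind, using SciPy's `kmeans2`. The objective `f` is a toy placeholder with two basins, and `k = 3`, `dim = 3`, and the sampling box are arbitrary choices for illustration:

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)

# Toy objective with two well-separated basins (placeholder for the real f).
def f(x):
    return min(np.sum((x - 2.0) ** 2), 0.5 + np.sum((x + 2.0) ** 2))

dim, n = 3, 200
X = rng.uniform(-5.0, 5.0, size=(n, dim))   # initial random sampling
fvals = np.array([f(x) for x in X])

# Cluster the sample locations into k groups (k is guessed, e.g. 1 to 3).
k = 3
centroids, labels = kmeans2(X, k, minit="++", seed=0)

# Best sample per cluster -> candidate starting points for local optimizers.
starts = [X[labels == j][np.argmin(fvals[labels == j])]
          for j in range(k) if np.any(labels == j)]
```

Note that plain $k$-means clusters only the locations $X_i$, ignoring the $f_i$ values, which is part of what I am unsure about.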
One difficulty I have already observed: if the initial sampling does not cover a sufficiently wide space, then even the true global minimum may be hit only rarely, compared to a local minimum near the center of the sampling space.
One brute-force method would be to pick a good sample $(f_0, X_0)$, pick e.g. 6 other samples "nearby", and fit a quadratic model. If the Hessian has good properties (e.g. it is positive definite), then we have a good candidate. Then we might check for another sample that is "good" regarding $f$ but with $\|X - X_0\|$ large enough. But I wonder if there are existing "packaged" methods?
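A sketch of what I mean by the quadratic model, assuming a plain least-squares fit of $f(x) \approx c + g^\top x + \tfrac12 x^\top H x$ to the nearby samples (the helper name and the toy data are my own, not from any package):

```python
import numpy as np

def fit_quadratic(X, fvals):
    """Least-squares fit of f(x) ~ c + g.x + 0.5 x^T H x to samples.
    Needs at least 1 + d + d(d+1)/2 points; returns (c, g, H)."""
    n, d = X.shape
    cols = [np.ones(n)]                      # constant term
    cols += [X[:, i] for i in range(d)]      # linear terms
    for i in range(d):                       # quadratic terms x_i * x_j
        for j in range(i, d):
            cols.append(X[:, i] * X[:, j])
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, fvals, rcond=None)
    c, g = coef[0], coef[1:1 + d]
    H = np.zeros((d, d))
    idx = 1 + d
    for i in range(d):
        for j in range(i, d):
            if i == j:
                H[i, i] = 2.0 * coef[idx]    # coefficient of x_i^2 is H_ii/2
            else:
                H[i, j] = H[j, i] = coef[idx]
            idx += 1
    return c, g, H

# Example: samples from an exact quadratic bowl, d = 2.
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 2))
fvals = np.array([1.0 + x[0] ** 2 + 2.0 * x[1] ** 2 for x in X])
c, g, H = fit_quadratic(X, fvals)

# All eigenvalues positive => locally convex => good minimum candidate.
is_minimum_candidate = bool(np.all(np.linalg.eigvalsh(H) > 0))
```

The "good properties" check then amounts to testing the eigenvalues of the fitted $H$.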
I wonder: if we first filter out all samples with $f_i$ above some threshold (too large) and simply run $k$-means on the rest, do we filter out too many local-minimum clusters, so that the overall optimization would be no better than most local optimization methods?
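The full filter-then-cluster-then-minimize pipeline I am describing might look like the following sketch. The quartile cutoff, $k = 2$, the toy objective, and the choice of Nelder-Mead are all placeholder assumptions on my part:

```python
import numpy as np
from scipy.cluster.vq import kmeans2
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Toy 2-D objective: global minimum near (2, 2), local minimum near (-2, -2).
def f(x):
    return min(np.sum((x - 2.0) ** 2), 0.5 + np.sum((x + 2.0) ** 2))

n, dim, k = 300, 2, 2
X = rng.uniform(-5.0, 5.0, size=(n, dim))
fvals = np.array([f(x) for x in X])

# Keep only the best quartile before clustering; the cutoff is a free choice.
keep = fvals <= np.quantile(fvals, 0.25)
Xk, fk = X[keep], fvals[keep]

centroids, labels = kmeans2(Xk, k, minit="++", seed=0)

# One local minimization per surviving cluster, started at its best sample.
results = []
for j in range(k):
    mask = labels == j
    if not np.any(mask):
        continue
    x0 = Xk[mask][np.argmin(fk[mask])]
    results.append(minimize(f, x0, method="Nelder-Mead"))

best = min(results, key=lambda r: r.fun)
```

My concern is exactly this filtering step: a shallow, well-covered local basin can dominate the best quartile and crowd a sparsely sampled global basin out of every cluster.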