I'm studying the Golden Section Search (https://en.wikipedia.org/wiki/Golden-section_search). As I understand it, the algorithm must be fed at least two initial points a and b, and the function must be unimodal between a and b.
I'm trying to find a good way to choose an initial guess for 'b', given an 'a'. It seems more usual to simply ask the user for it (for example, the SciPy implementation asks for [a,b,c] or [a,b]: https://github.com/scipy/scipy/blob/v0.14.0/scipy/optimize/optimize.py#L1901). I couldn't find any method for estimating b, and I also don't know how hard this problem is.
I tried two things so far:
1) Given 'a' and a 'step', increment 'step' until f(a) < f(a + step_{k-1}) < f(a + step_k).
This approach seems quite bad, since it appears to degrade the overall performance of the golden-section search. Also, there is no guarantee that the step didn't "jump over" a minimum.
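A sketch of this first idea in Python (the helper name `linear_bracket` and the fixed default step are just for illustration):

```python
def linear_bracket(f, a, step=1.0):
    """Probe f at a + step, a + 2*step, ... until the values start
    increasing again; return the last three probe points."""
    x_prev, x_curr = a, a + step
    f_prev, f_curr = f(x_prev), f(x_curr)
    while f_curr <= f_prev:           # still descending: keep stepping
        x_prev, f_prev = x_curr, f_curr
        x_curr += step
        f_curr = f(x_curr)
    return a, x_prev, x_curr          # candidate bracket (a, b, c)
```

If `step` is large relative to the function's features, this can step right over a narrow minimum, which is exactly the concern above.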
2) Given 'a', run x_ = golden_search(a, a + step_k); if f(x_) is a minimum (first-/second-order tests), stop; otherwise set step_{k+1} = step_k + eps and repeat.
Does anyone have a better solution, or see any other problems with the two ideas I suggested?
If we assume that the function $f(x)$ is strictly unimodal on $[a, \infty)$ with a minimum at some point $x^*$ such that $x^*>a$, then the problem is at least theoretically solvable. (Johan points out in the comments what goes wrong if $f(x)$ is not strictly unimodal, and is constant on large intervals, but that's a problem for the usual golden-section search as well.)
In this case, by analogy with exponential search for finding zeroes of $f(x)$, we could try a variant of your approach 1 where the step sizes increase exponentially. For instance, evaluate $$f(a+1), f(a + \phi), f(a + \phi^2), f(a + \phi^3), \dots$$ until $f(a + \phi^{k-1}) < f(a + \phi^k)$. Then run the usual golden-section search starting with the three points $(a, a + \phi^{k-1}, a + \phi^k)$.
(The value $f(a+1)$ to start with is arbitrary; we could start with $f(a+\delta)$ for any $\delta>0$, and continue $f(a+\delta \phi), f(a+\delta \phi^2), \dots$. One practical concern with picking $\delta$ is to choose a value large enough that $a+\delta$ is distinguishable from $a$ in your floating-point representation.)
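A sketch of this strategy in Python (the function names are mine, and for simplicity the golden-section search below re-evaluates $f$ at both interior points each iteration instead of reusing one evaluation, as the economical version does):

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio

def exponential_bracket(f, a, delta=1.0):
    """Probe f at a + delta*PHI**k for k = 0, 1, 2, ... until the values
    start increasing; return (a, a + delta*PHI**(k-1), a + delta*PHI**k)."""
    k = 1
    f_prev = f(a + delta)            # f at a + delta * PHI**0
    f_curr = f(a + delta * PHI)
    while f_prev >= f_curr:          # still descending: grow the step
        k += 1
        f_prev = f_curr
        f_curr = f(a + delta * PHI ** k)
    return a, a + delta * PHI ** (k - 1), a + delta * PHI ** k

def golden_section(f, lo, hi, tol=1e-8):
    """Standard golden-section search for a minimum of f on [lo, hi]."""
    inv = 1 / PHI
    while hi - lo > tol:
        c = hi - (hi - lo) * inv     # interior points at golden ratios
        d = lo + (hi - lo) * inv
        if f(c) < f(d):
            hi = d                   # minimum lies in [lo, d]
        else:
            lo = c                   # minimum lies in [c, hi]
    return (lo + hi) / 2
```

For example, with $f(x) = (x-10)^2$ and $a = 0$, `exponential_bracket` returns a triple $(a, b, c)$ straddling $x^* = 10$ after a handful of evaluations, and `golden_section(f, a, c)` then refines the estimate of $x^*$.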
We obviously can't make any guarantees about the running time independent of $x^*$; no matter how many points you try, it's possible that $x^*$ is bigger than all of them. However, this method is guaranteed to find a value of $b$ in $\mathcal O(\log (x^* - a))$ steps.
This is not an unreasonable running time given that, once you've found $b$, you'll need $\mathcal O(\log (b - a))$ steps to estimate $x^*$ to a desired level of precision. In other words, using this search strategy, we don't increase the running time by more than a constant factor of what it would be if we were given $b$ ahead of time.