In Dekker's and Brent's method, in the initial steps, if $|f(a)|<|f(b)|$, we swap $a$ and $b$.
Why is this?
I've searched for reasons why, but I cannot find a reason why.
- Cleve Moler mentions:
$b$ is the best zero so far, in the sense that $f(b)$ is the smallest value of $f(x)$ so far.
- Oscar Veliz (2:51) mentions that it is to keep $b$ as the "better" guess.
But I cannot find an explanation as to why this is.
Because the convention in Dekkers description of the algorithm (but not Wikipedia's Dekker-Brent method article), the loop invariant if you want, is that there are 3 points $a,b,c$ that represent the current state (Brent's method has 4 points, 3 points for inverse quadratic interpolation and one bracketing counter-point):
This loop convention avoids additional logic inside the loop, reduces the cases to consider.
In Dekker's method, the pair of $a$ and $b$ usually contains the last two iterates of the secant method. This sequence of course gets broken if some non-secant step (midpoint or minimal step) was taken.
Given a bracketing interval $[a,b]$, the actual first step of Dekker's method is to assign $c=a$. After that the values at $b,c$ are compared as first step in the loop.
In a "normal" step, when $a,b$ are close to the root inside $[b,c]$ and the function looks almost linear at that "zoom level", the value at the secant root $s$ computed from $a$ and $b$ can be expected to be closer to $b$ than the midpoint $m=\frac12(b+c)$ and to be the new smallest value, esp. $|f(s)|<|f(b)|<|f(a)|$.
If these assumptions fail, then the midpoint and the secant root computed from $b,c$ are perhaps better candidates for the next iteration point $x$ (remember that only the evaluation of $f$ is "costly", all other computations count as "free").
This is almost surely the case when (at the start of the loop) $|f(c)|<|f(b)|<|f(a)|$ (the latter because of the previous step). In that case, the comparison in question at the start of the loop sets $a,b,c = b,c,b$, so it not only swaps $b,c$ in the bracketing interval, but also sets up that the next secant root is computed from the bracketing interval à la regula falsi.
(Here the answer ends, for completeness I add the remaining steps of the algorithm)
If even with these precautions the secant root falls outside $[b,m]$, then the midpoint $m$ is chosen as next point $x$, or a minimal step to get out of numerically ambiguous situations like a double root.
With now $b<x\le m < c$ the state is shifted to $a,b,c = b,x,c$. As the last act inside the loop, if $[b,c]$ is no longer a bracketing interval, the counter-point $c$ is set to $a$ so that the signs of the function values are again opposite.