Bisection method with geometric mean


The bisection method is a well-known method for root-finding. Given a continuous function $f$ and an interval $[a,b]$ where $f(a)$ and $f(b)$ have opposite signs, a root is guaranteed to lie in $(a,b)$. The bisection method computes $f(\frac{a+b}2)$ and iteratively refines the interval based on its sign. Its main advantages are simplicity and guaranteed linear convergence, since the width of the interval halves on every iteration.

In floating point arithmetic, however, the float directly in between $a$ and $b$, in the sense that roughly as many representable floats lie below it as above it, is given not by $\frac{a+b}2$ but approximately by $\sqrt{ab}$, assuming $a$ and $b$ are both positive. For this reason I wonder if it is actually advantageous to use the geometric mean instead of the arithmetic mean. Similar to the arithmetic bisection method, the geometric bisection method halves the gap between $\log(a)$ and $\log(b)$ on every step, so linear convergence is guaranteed in a similar manner.
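
To see this, one can compare the float sitting midway between $a$ and $b$ by count of representable doubles against the geometric mean (a Python sketch; the endpoints $10^{-10}$ and $10^{10}$ are just illustrative):

```python
import math
import struct

def bits(x):
    # Bit pattern of a finite positive double, viewed as an unsigned integer.
    # For positive doubles this integer ordering agrees with numeric ordering.
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def from_bits(n):
    return struct.unpack('<d', struct.pack('<Q', n))[0]

a, b = 1e-10, 1e10
# The double with (roughly) as many representable doubles below it as above it:
mid_by_count = from_bits((bits(a) + bits(b)) // 2)
geo = math.sqrt(a) * math.sqrt(b)
print(mid_by_count, geo)  # both land near 1, far from (a + b)/2 = 5e9
```

The agreement is only approximate (the bit pattern is piecewise linear in the mantissa, not exactly logarithmic), but both midpoints are within a few percent of each other, while the arithmetic mean $5\times10^9$ leaves almost all representable floats on one side.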

Interestingly, the arithmetic mean halves the absolute error, while the geometric mean halves the relative error.

Q: Should we be using the arithmetic or geometric (or possibly other) mean when using bisection with floats? What are the advantages and disadvantages of each?

Accepted answer:

It would seem to be the case, at least as far as I have tested, that the geometric mean is quite useful when $a$ and $b$ differ greatly in magnitude.

Advantages of geometric means:

In double precision, the extreme cases are roughly $10^{\pm308}$. Supposing we are trying to reach $x=2$ to machine precision using these two initial points:

  • arithmetic means would require roughly 1000 iterations.
  • geometric means would require roughly 60 iterations.

This means the worst case scenario for geometric means is far better.
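
This worst case can be reproduced with a short experiment (a minimal sketch, not production code; `bisect` and the test function $f(x)=x-2$ are my own choices):

```python
import math

def bisect(f, a, b, mean):
    """Bisection with a pluggable midpoint rule; returns (root, steps).
    Assumes 0 < a < b and a sign change of f on [a, b]."""
    fa, n = f(a), 0
    while True:
        m = mean(a, b)
        if not (a < m < b):          # interval can no longer be split
            return m, n
        n += 1
        fm = f(m)
        if fm == 0.0:
            return m, n
        if (fm < 0.0) == (fa < 0.0):
            a, fa = m, fm
        else:
            b = m

f = lambda x: x - 2.0                            # root at x = 2
arith = lambda a, b: 0.5 * (a + b)
geo = lambda a, b: math.sqrt(a) * math.sqrt(b)   # two sqrts avoid overflow of a*b

r1, n1 = bisect(f, 1e-308, 1e308, arith)
r2, n2 = bisect(f, 1e-308, 1e308, geo)
print(n1, n2)   # roughly 1000 steps vs roughly 60 steps
```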

In less extreme scenarios (such as a bracket like $[1,6]$ for $x=2$), arithmetic means require roughly 50 iterations to reach machine precision, but the same is true for geometric means as well. This may be justified by noting that the difference between the arithmetic and geometric means

$$\frac{a+b}2-\sqrt{ab}=\frac{(\sqrt a-\sqrt b)^2}2=\frac{(a-b)^2}{2(\sqrt a+\sqrt b)^2}\sim\frac{(a-b)^2}{8x}$$

decays quickly as the interval shrinks.
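
A quick numeric check of this asymptotic, taking $x\approx\frac{a+b}2$ (the bracket $[1.99,2.03]$ is illustrative, not from the answer):

```python
import math

# Compare the exact gap between the two means with the asymptotic estimate
# (a - b)^2 / (8x) for a narrow bracket around x ~ 2.
a, b = 1.99, 2.03
x = 0.5 * (a + b)
exact = 0.5 * (a + b) - math.sqrt(a * b)
approx = (a - b) ** 2 / (8.0 * x)
print(exact, approx)   # the two agree to several digits
```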

Disadvantages of geometric means:

  • Some edge-case handling becomes necessary (endpoints with different signs, or $0$ as one of the points), meaning more complicated code.
  • May converge slower than expected if one point is very close to $0$ and the other is not (e.g. $[a,b]=[10^{-308},2]$ with a root at $x=1.3$), since the geometric mean does not initially close in on the root as rapidly as the arithmetic mean.
  • Higher arithmetic cost per iteration because one square root (or two to avoid under-/over-flow using $\sqrt x\cdot\sqrt y$) must be computed.
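
The second point can be checked directly (a sketch using the bracket $[10^{-308},2]$ and root $1.3$ from the bullet above, i.e. $f(x)=x-1.3$; the comparison helper is my own):

```python
import math

def count_steps(a, b, mean, root=1.3):
    # Count bisection steps for f(x) = x - root until the interval can
    # no longer be split. Assumes 0 < a < root < b.
    n = 0
    while True:
        m = mean(a, b)
        if not (a < m < b):
            return n
        n += 1
        if m < root:
            a = m
        elif m > root:
            b = m
        else:
            return n

n_arith = count_steps(1e-308, 2.0, lambda a, b: 0.5 * (a + b))
n_geo = count_steps(1e-308, 2.0, lambda a, b: math.sqrt(a) * math.sqrt(b))
print(n_arith, n_geo)  # geometric needs somewhat more steps on this bracket
```

Here the arithmetic mean jumps to $1$ on its very first step, while the geometric mean spends its first several steps crawling up through the exponent range far below the root.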

Possible fixes:

  • When the endpoints have opposite signs, one can bisect at the smallest positive float multiplied by the sign of the endpoint with larger magnitude.
  • A mixture of arithmetic and geometric means should recover initially expected behavior.
    • The arithmetic-geometric mean may be interesting to use.
    • A simpler solution would be to alternate between arithmetic and geometric means.

Update 10/26:

As I've explained here, once $x/y\in(0.5,2)$, one should switch from the geometric mean to the arithmetic mean. This conclusion is based on the structure of the IEEE double format.
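
A sketch of this rule (my reading of it, not the author's exact code; assumes $0 < x < y$, with the sign edge cases discussed earlier still needing separate handling):

```python
import math

def hybrid_mid(x, y):
    # Geometric mean while the endpoints differ greatly in magnitude;
    # arithmetic mean once x/y lies in (0.5, 2). Assumes 0 < x < y.
    if y < 2.0 * x:
        return 0.5 * (x + y)
    return math.sqrt(x) * math.sqrt(y)  # two sqrts avoid over-/under-flow
```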

Update 11/03:

It actually makes more sense to use $(3x+y)/4$, where $|x|<|y|$, when the geometric mean fails to significantly reduce the absolute error. Intuitively, this is roughly equivalent to two iterations of the arithmetic mean. In the worst case, this may cost one or two extra iterations of arithmetic means when one iteration of ordinary bisection would have sufficed. This is particularly important in relation to hybrid root-finding methods, where reducing the absolute error more readily improves the interpolation.
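
One possible reading of this modified step (an interpretation, not the author's exact criterion: here "fails to significantly reduce the absolute error" is taken to mean the geometric mean would cut off less than a quarter of the interval's width; assumes $0 < x < y$):

```python
import math

def guarded_geo_mid(x, y):
    # Geometric mean with the (3x + y)/4 fallback described above.
    # Assumed trigger: the geometric mean lands within the first quarter
    # of the interval, so it barely shrinks the absolute error.
    # Assumes 0 < x < y.
    m = math.sqrt(x) * math.sqrt(y)
    if m - x < 0.25 * (y - x):
        return 0.25 * (3.0 * x + y)   # ~ two arithmetic bisection steps
    return m
```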