Better algorithm to find the numerical value of one over the square root


Description: To find the value of $1/\sqrt{a}$, we essentially find the positive root $x^{\star}$ of the function $f(x)$

$$ f(x) = x^2 - \dfrac{1}{a} $$

One way to find it is Newton's method, iterating from an initial value $x_0$:

$$ x_{k+1} = x_{k} - \dfrac{f(x_{k})}{f'(x_k)} = \dfrac{1}{2}\left(x_k + \dfrac{1}{ax_k}\right) $$ $$ \lim_{k\to \infty}x_{k} = \dfrac{1}{\sqrt{a}} $$

Newton's method has quadratic convergence: for some positive constant $M$,

$$ \left|x_{k+1} - \dfrac{1}{\sqrt{a}}\right| \le M \cdot \left|x_{k} - \dfrac{1}{\sqrt{a}}\right|^2 $$
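As a quick sanity check, here is a minimal Python sketch of this iteration (the function name is my own). Note that each step still performs a division by $a x_k$:

```python
import math

def inv_sqrt_newton(a, x0, iters=8):
    """Newton's method on f(x) = x**2 - 1/a:
    x_{k+1} = (x_k + 1/(a*x_k)) / 2, converging to 1/sqrt(a)."""
    x = x0
    for _ in range(iters):
        x = 0.5 * (x + 1.0 / (a * x))
    return x

a = 7.0
print(inv_sqrt_newton(a, x0=1.0), 1.0 / math.sqrt(a))
```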

Question: Is there a numerical method with a higher order of convergence than Newton's method for finding $x^{\star} = \dfrac{1}{\sqrt{a}}$?

EDIT: I thought about using the Taylor expansion up to second order, since Newton's method comes from the first-order expansion:

$$ f(x) = f(x_k) + (x-x_k) \cdot f'(x_k) + \dfrac{(x-x_k)^2}{2} \cdot f''(x_k) + \dfrac{(x-x_k)^3}{6} \cdot f'''(\varepsilon(x)) $$

and then finding the value $x_{k+1}$ satisfying

$$0 = f(x_k) + (x_{k+1} - x_k) \cdot f'(x_k)+ \dfrac{(x_{k+1}-x_k)^2}{2} \cdot f''(x_k)$$

Using the function $f(x) = x - \dfrac{1}{ax}$ instead, this can be rewritten as

$$x_{k+1}^2 - \left(3+ax_k^2\right) \cdot x_k \cdot x_{k+1} + 3x_k^2 = 0$$

whose relevant solution is

$$x_{k+1} = \dfrac{x_k}{2}\left(3+ax_k^2 - \sqrt{\left(3+ax_{k}^2\right)^2 - 12}\right)$$

Since there's a square root, which may be hard to compute, we replace it by its second-order Taylor approximation around $\frac{1}{\sqrt{a}}$:

$$\sqrt{\left(3+ax^2\right)^2 - 12}\approx 6x\sqrt{a} - ax^2 -3$$
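This expansion can be checked numerically. A small Python sketch, taking $a = 4$ so the expansion point is exactly $1/\sqrt{a} = 1/2$ (the function names are mine):

```python
import math

a = 4.0
root = 1.0 / math.sqrt(a)  # expansion point: 1/sqrt(4) = 0.5

def exact(x):
    return math.sqrt((3.0 + a * x * x) ** 2 - 12.0)

def approx(x):
    return 6.0 * x * math.sqrt(a) - a * x * x - 3.0

for dx in (0.0, 0.01, 0.05):
    x = root + dx
    print(f"x={x:.2f}  exact={exact(x):.6f}  approx={approx(x):.6f}")
```

The discrepancy shrinks like $(x - 1/\sqrt{a})^3$, as expected for a second-order expansion.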

Since $\sqrt{a} \approx \dfrac{1}{2}\left(y+\dfrac{a}{y}\right) = \dfrac{1}{2}\left(\dfrac{1}{x} + ax\right)$ for $y = \dfrac{1}{x} \approx \sqrt{a}$, then

$$\sqrt{\left(3+ax^2\right)^2 - 12}\approx 6x \cdot \dfrac{1}{2}\left(\dfrac{1}{x} + ax\right) -ax^2 - 3 = 2ax^2$$ $$\boxed{x_{k+1} = \dfrac{x_k(3-ax_k^2)}{2}}$$

which is a simple polynomial expression.
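A minimal Python sketch of the boxed iteration (the function name is mine); apart from the halving, it needs only multiplications and subtractions, with no division at all:

```python
import math

def inv_sqrt_poly(a, x0, iters=10):
    """Division-free iteration x_{k+1} = x_k * (3 - a * x_k**2) / 2."""
    x = x0
    for _ in range(iters):
        x = x * (3.0 - a * x * x) / 2.0
    return x

a = 7.0
print(inv_sqrt_poly(a, x0=0.3), 1.0 / math.sqrt(a))
```

The starting value must lie in the basin of attraction (roughly $0 < x_0 < \sqrt{3/a}$), otherwise the cubic map diverges.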

But I have no idea about the order of convergence of this iteration function. It would be $3$ because we used a second-order Taylor expansion of $f$, but since we also approximated the square root around $1/\sqrt{a}$, it seems it should be below $3$.

  1. Then, what about other functions, like $f=x^4-a^{-2}$, using the same tricks?

Motivation: There's an algorithm known as Fast Inverse Square Root (more details here) which starts from a good initial estimate (using logarithms and bit manipulation), but uses Newton's iteration to refine the value in the final step.
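For reference, a Python transcription of that single-precision trick, with the well-known magic constant 0x5F3759DF (the function name is mine; `struct` reinterprets the float's bits as a 32-bit integer and back):

```python
import struct

def fast_inv_sqrt(x):
    """Fast inverse square root: bit-level initial guess + one Newton step."""
    i = struct.unpack('<I', struct.pack('<f', x))[0]  # float bits as uint32
    i = 0x5F3759DF - (i >> 1)                         # magic initial estimate
    y = struct.unpack('<f', struct.pack('<I', i))[0]  # bits back to float
    return y * (1.5 - 0.5 * x * y * y)                # one Newton refinement
```

The refinement line is exactly the boxed polynomial iteration $y(3 - xy^2)/2$, which is why no division appears; the relative error after one step is already around $0.2\%$ or less.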

Accepted answer

Householder's method of order $d$ to solve $f(x) = 0$ gives

$$x \to H_d(x) = x + d \frac {\left(1/f\right)^{(d-1)}(x)}{\left(1/f\right)^{(d)}(x)}$$

with rate of convergence $d + 1$.

Using the (wx)Maxima computer algebra system, for

$$f(x) = x^{-2} - a$$

I get

$$\begin{aligned} H_1(x) &= \frac{(3 - a x^2)x}{2} \\ H_2(x) &= \frac{(3 + a x^2)x}{3 a x^2 + 1} \\ H_3(x) &= \frac{a^2x^4 + 6ax^2 + 1}{4ax(ax^2+1)} \\ H_4(x) &= \frac{x(a^2x^4 + 10ax^2 + 5)}{5a^2x^4 + 10ax^2 + 1} \\ \end{aligned}$$
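The cubic convergence of $H_2$, for instance, can be observed directly. A sketch using exact rational arithmetic, with $a = 4$ so the target $1/\sqrt{a} = 1/2$ is an exact rational and the errors can be measured exactly (names are mine):

```python
from fractions import Fraction

def H2(a, x):
    """Householder step of order 2 (Halley's method) for f(x) = x**-2 - a."""
    return x * (3 + a * x * x) / (3 * a * x * x + 1)

a = Fraction(4)
x = Fraction(3, 10)
for _ in range(4):
    x = H2(a, x)
    print(float(abs(x - Fraction(1, 2))))  # error roughly cubes each step
```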

Suppose the target precision is $N$ accurate digits. The last iteration, which delivers $N$ accurate digits, needs $N/(d+1)$ accurate digits as input, so around $\log_{d+1} N$ iteration steps are needed.

Assume the cost of a $D$-digit multiplication (and hence division) is given by some function $M(D)$ (for example $M(D) = D^{\log_2 3}$ when using the Karatsuba algorithm), and assume that multiplication and division by small integers, as well as addition and subtraction of $D$-digit numbers, are insignificant in comparison. Then the cost depends on whether the number of digits of $a$ is small or large. Many intermediate results can be reused/shared to reduce the number of multiplications.

For small $a$:

$$\begin{aligned} \operatorname{cost}(H_1) = 2 (M(N/1) + M(N/2) + M(N/4) + \cdots + M(1)) &< \frac{2}{\log 2} M(N) \log(N) \\ \operatorname{cost}(H_2) = 3 (M(N/1) + M(N/3) + M(N/9) + \cdots + M(1)) &< \frac{3}{\log 3} M(N)\log(N) \\ \operatorname{cost}(H_3) = 4 (M(N/1) + M(N/4) + M(N/16) + \cdots + M(1)) &< \frac{4}{\log 4} M(N) \log(N) \\ \operatorname{cost}(H_4) = 4 (M(N/1) + M(N/5) + M(N/25) + \cdots + M(1)) &< \frac{4}{\log 5} M(N) \log(N) \\ \end{aligned}$$

showing that higher orders improve the overall cost, at least when $N$ is large enough that additions etc. really are insignificant.

For $a$ with $N$ digits:

$$\begin{aligned} \operatorname{cost}(H_1) = 3 (M(N/1) + M(N/2) + M(N/4) + \cdots + M(1)) &< \frac{3}{\log 2} M(N) \log(N) \\ \operatorname{cost}(H_2) = 4 (M(N/1) + M(N/3) + M(N/9) + \cdots + M(1)) &< \frac{4}{\log 3} M(N) \log(N) \\ \operatorname{cost}(H_3) = 5 (M(N/1) + M(N/4) + M(N/16) + \cdots + M(1)) &< \frac{5}{\log 4} M(N) \log(N) \\ \operatorname{cost}(H_4) = 5 (M(N/1) + M(N/5) + M(N/25) + \cdots + M(1)) &< \frac{5}{\log 5} M(N) \log(N) \\ \end{aligned}$$

Reference: https://en.wikipedia.org/wiki/Householder%27s_method

(wx)Maxima source code:

f(x) := x^(-2) - a;
H(d, x) := factor(ratsimp(x + d * (diff(1/f(x), x, d - 1)) / (diff(1/f(x), x, d))));
H(1, x);
H(2, x);
H(3, x);
H(4, x);