Description: To find the value of $1/\sqrt{a}$, we look for the root $x^{\star}$ of the function $f(x)$
$$ f(x) = x^2 - \dfrac{1}{a} $$
One way to find it is Newton's method, iterating from an initial value $x_0$:
$$ x_{k+1} = x_{k} - \dfrac{f(x_{k})}{f'(x_k)} = \dfrac{1}{2}\left(x_k + \dfrac{1}{ax_k}\right) $$ $$ \lim_{k\to \infty}x_{k} = \dfrac{1}{\sqrt{a}} $$
Newton's method has quadratic convergence: for some positive constant $M$,
$$ \left|x_{k+1} - \dfrac{1}{\sqrt{a}}\right| \le M \cdot \left|x_{k} - \dfrac{1}{\sqrt{a}}\right|^2 $$
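As a quick sanity check (a minimal Python sketch; the function name and starting values are my own), the iteration above reaches machine precision in a handful of steps:

```python
def newton_inv_sqrt(a: float, x0: float, steps: int = 6) -> float:
    """Newton's method for f(x) = x^2 - 1/a:  x_{k+1} = (x_k + 1/(a*x_k)) / 2."""
    x = x0
    for _ in range(steps):
        x = 0.5 * (x + 1.0 / (a * x))
    return x

print(newton_inv_sqrt(2.0, 1.0))  # approaches 1/sqrt(2) ≈ 0.70710678...
```

Note that this rearrangement still costs one division per step, which is part of why division-free refinements are interesting.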
Question: Is there a numerical method with a higher order of convergence than Newton's method for finding $x^{\star} = \dfrac{1}{\sqrt{a}}$?
EDIT: I thought about using the Taylor expansion up to second order, since Newton's method comes from the first-order expansion:
$$ f(x) = f(x_k) + (x-x_k) \cdot f'(x_k) + \dfrac{(x-x_k)^2}{2} \cdot f''(x_k) + \dfrac{(x-x_k)^3}{6} \cdot f'''(\varepsilon(x)) $$
And then solve for the value $x_{k+1}$ such that
$$0 = f(x_k) + (x_{k+1} - x_k) \cdot f'(x_k)+ \dfrac{(x_{k+1}-x_k)^2}{2} \cdot f''(x_k)$$
Using instead the function $f(x) = x - \dfrac{1}{ax}$, which has the same positive root $x^{\star} = \dfrac{1}{\sqrt{a}}$, this becomes
$$x_{k+1}^2 - \left(3+ax_k^2\right) \cdot x_k \cdot x_{k+1} + 3x_k^2 = 0$$
Its relevant solution (the minus-sign root, which stays near $x_k$) is
$$x_{k+1} = \dfrac{x_k}{2}\left(\left(3+ax_k^2\right) - \sqrt{\left(3+ax_{k}^2\right)^2 - 12}\right)$$
As there is a square root, which may be expensive to compute, we replace it by its second-order Taylor approximation around $\frac{1}{\sqrt{a}}$:
$$\sqrt{\left(3+ax^2\right)^2 - 12}\approx 6x\sqrt{a} - ax^2 -3$$
Since $\sqrt{a} \approx \dfrac{1}{2}\left(y+\dfrac{a}{y}\right)$ for $y \approx \sqrt{a}$, taking $y = \dfrac{1}{x}$ gives $\sqrt{a} \approx \dfrac{1}{2}\left(\dfrac{1}{x} + ax\right)$, and then
$$\sqrt{\left(3+ax^2\right)^2 - 12}\approx 6x \cdot \dfrac{1}{2}\left(\dfrac{1}{x} + ax\right) -ax^2 - 3 = 2ax^2$$ $$\boxed{x_{k+1} = \dfrac{x_k(3-ax_k^2)}{2}}$$
This is a simple polynomial expression, with no general divisions or square roots to evaluate (only a halving).
But I have no idea about the order of convergence of this iteration function. It would be $3$ because we used the second-order Taylor expansion of $f$, but since we also approximated the square root around $1/\sqrt{a}$, it seems the order drops below $3$.
- Then, what about other functions, like $f=x^4-a^{-2}$, using the same tricks?
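To probe that suspicion numerically, here is a small Python experiment (entirely my own; the choices $a = 2$ and $x_0 = 0.6$ are arbitrary) estimating the order $p$ from successive errors via $e_{k+1} \approx C\,e_k^{\,p}$:

```python
from math import sqrt, log

def step(x, a):
    # boxed iteration from above: x_{k+1} = x_k * (3 - a*x_k^2) / 2
    return x * (3.0 - a * x * x) / 2.0

a = 2.0
target = 1.0 / sqrt(a)
x = 0.6                          # arbitrary start near the root
errs = []
for _ in range(4):
    x = step(x, a)
    errs.append(abs(x - target))

# p ≈ log(e_{k+1}/e_k) / log(e_k/e_{k-1})
orders = [log(errs[i + 1] / errs[i]) / log(errs[i] / errs[i - 1])
          for i in range(1, len(errs) - 1)]
print(orders)  # estimates come out close to 2
```

The estimates cluster around $2$ rather than $3$: the boxed map is exactly Newton's step for $f(x) = x^{-2} - a$ (it coincides with $H_1$ in the answer below), so the square-root approximation indeed costs one order.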
Motivation: There's an algorithm known as the Fast Inverse Square Root which starts from a good initial estimate (obtained using logarithms and bit manipulation), but it uses Newton's iteration to refine the value in its final step.
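For concreteness, a Python transcription of that algorithm (my own sketch, not the original C code; the magic constant `0x5F3759DF` is the well-known one from the Quake III source) shows that its refinement step is exactly the division-free update $x(3 - ax^2)/2$:

```python
import struct

def fast_inv_sqrt(a: float) -> float:
    # reinterpret the float32 bits of a as an unsigned 32-bit integer
    i = struct.unpack('<I', struct.pack('<f', a))[0]
    i = 0x5F3759DF - (i >> 1)              # bit-level initial estimate
    x = struct.unpack('<f', struct.pack('<I', i))[0]
    return x * (1.5 - 0.5 * a * x * x)     # one Newton step: x*(3 - a*x^2)/2

print(fast_inv_sqrt(4.0))  # close to 0.5
```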
Householder's method of order $d$ to solve $f(x) = 0$ gives
$$x \to H_d(x) = x + d \frac {\left(1/f\right)^{(d-1)}(x)}{\left(1/f\right)^{(d)}(x)}$$
with rate of convergence $d + 1$.
Using the (wx)Maxima computer algebra system, for
$$f(x) = x^{-2} - a$$
I get
$$\begin{aligned} H_1(x) &= \frac{(3 - a x^2)x}{2} \\ H_2(x) &= \frac{(3 + a x^2)x}{3 a x^2 + 1} \\ H_3(x) &= \frac{a^2x^4 + 6ax^2 + 1}{4ax(ax^2+1)} \\ H_4(x) &= \frac{x(a^2x^4 + 10ax^2 + 5)}{5a^2x^4 + 10ax^2 + 1} \\ \end{aligned}$$
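As a numerical sanity check of this table (a short Python script I added; $a = 7$ and the starting guess $0.3$ are arbitrary), all four updates converge to $1/\sqrt{a}$:

```python
from math import sqrt

# Householder updates for f(x) = x^-2 - a, transcribed from the table above
def H1(x, a): return x * (3 - a*x*x) / 2
def H2(x, a): return x * (3 + a*x*x) / (3*a*x*x + 1)
def H3(x, a): return (a*a*x**4 + 6*a*x*x + 1) / (4*a*x*(a*x*x + 1))
def H4(x, a): return x * (a*a*x**4 + 10*a*x*x + 5) / (5*a*a*x**4 + 10*a*x*x + 1)

a = 7.0
results = {}
for name, H in [("H1", H1), ("H2", H2), ("H3", H3), ("H4", H4)]:
    x = 0.3                     # rough guess; 1/sqrt(7) ≈ 0.37796
    for _ in range(6):
        x = H(x, a)
    results[name] = x
print(results)
```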
Suppose the target precision is $N$ accurate digits. Since a method of order $d+1$ roughly multiplies the number of accurate digits by $d+1$ per step, the last iteration, which delivers $N$ accurate digits, needs $N/(d+1)$ accurate digits as input; in total about $\log_{d+1} N$ iteration steps are needed.
Assume the cost of a $D$-digit multiplication (and hence division) is given by some function $M(D)$ (for example $M(D) = D^{\log_2 3}$ when using the Karatsuba algorithm), and assume that multiplication and division by small integers, and addition and subtraction of $D$-digit numbers, are insignificant in comparison. The cost then depends on whether the number of digits of $a$ is small or large. Many intermediate results can be reused/shared to reduce the number of multiplications.
For small $a$:
$$\begin{aligned} \operatorname{cost}(H_1) = 2 (M(N/1) + M(N/2) + M(N/4) + \cdots + M(1)) &< \frac{2}{\log 2} M(N) \log(N) \\ \operatorname{cost}(H_2) = 3 (M(N/1) + M(N/3) + M(N/9) + \cdots + M(1)) &< \frac{3}{\log 3} M(N)\log(N) \\ \operatorname{cost}(H_3) = 4 (M(N/1) + M(N/4) + M(N/16) + \cdots + M(1)) &< \frac{4}{\log 4} M(N) \log(N) \\ \operatorname{cost}(H_4) = 4 (M(N/1) + M(N/5) + M(N/25) + \cdots + M(1)) &< \frac{4}{\log 5} M(N) \log(N) \\ \end{aligned}$$
showing that higher orders can improve the overall cost (note that $H_1$ and $H_3$ tie here, since $4/\log 4 = 2/\log 2$), at least when $N$ is large enough that the additions etc. really are insignificant.
For $a$ with $N$ digits:
$$\begin{aligned} \operatorname{cost}(H_1) = 3 (M(N/1) + M(N/2) + M(N/4) + \cdots + M(1)) &< \frac{3}{\log 2} M(N) \log(N) \\ \operatorname{cost}(H_2) = 4 (M(N/1) + M(N/3) + M(N/9) + \cdots + M(1)) &< \frac{4}{\log 3} M(N) \log(N) \\ \operatorname{cost}(H_3) = 5 (M(N/1) + M(N/4) + M(N/16) + \cdots + M(1)) &< \frac{5}{\log 4} M(N) \log(N) \\ \operatorname{cost}(H_4) = 5 (M(N/1) + M(N/5) + M(N/25) + \cdots + M(1)) &< \frac{5}{\log 5} M(N) \log(N) \\ \end{aligned}$$
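For comparison, the leading constants $C_d$ in the bounds $\operatorname{cost}(H_d) < C_d \, M(N)\log N$ can be evaluated directly (quick Python arithmetic; natural logarithms, as in the bounds above):

```python
from math import log

# C_d such that cost(H_d) < C_d * M(N) * log(N), read off the bounds above
small_a = {d: c / log(d + 1) for d, c in [(1, 2), (2, 3), (3, 4), (4, 4)]}
large_a = {d: c / log(d + 1) for d, c in [(1, 3), (2, 4), (3, 5), (4, 5)]}
print({d: round(v, 3) for d, v in small_a.items()})
print({d: round(v, 3) for d, v in large_a.items()})
```

In both regimes $H_4$ has the smallest constant ($4/\log 5 \approx 2.49$ and $5/\log 5 \approx 3.11$), while for small $a$ the constants of $H_1$ and $H_3$ coincide.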
Reference: https://en.wikipedia.org/wiki/Householder%27s_method
(wx)Maxima source code:
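The listing itself did not survive; a minimal reconstruction (my own sketch of what the Maxima session might look like, using `ratsimp` for simplification) that reproduces the table of $H_d$ above:

```maxima
f : x^(-2) - a$
g : 1 / f$                      /* Householder uses derivatives of 1/f */
H(d) := ratsimp(x + d * diff(g, x, d - 1) / diff(g, x, d))$
for d : 1 thru 4 do display(H(d))$
```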