Is there an equation similar to square root, but faster for a computer to compute?


I'm making an app that uses a Fermat's spiral to space objects out in an aesthetically pleasing way. This is a change from my first attempt, which used an Archimedean spiral, but I felt that the outer objects became too far apart.

Fermat's spiral uses a square root to calculate the current radius: $r = \sqrt{\theta}$ (I write $x$ for $\theta$ below).

Square root is a relatively slow operation for a computer to perform at every point, and I don't really need the function to be exactly the square root; I just need it to increase more slowly as its argument gets larger (preferably one that is not asymptotically bounded, though a bounded one could work). When I search for functions "similar to square root" I get nowhere. Remembering high-school math, I think

$f(x) = 1-\frac{1}{x+1}$

might work alright.

What are some more functions $f(x)$ that, similarly to square root, increase fast at first, but slow down as $x$ increases?

In case it causes problems: I'm not asking for opinions on which option is better; I just want options (or a link to a collection, if one exists). That said, help on making my spiral better would also be welcome, in case I'm barking up the wrong tree for an aesthetically pleasing, fairly even distribution around a point.


There are 6 best solutions below

On BEST ANSWER

This question is better suited for a different Stack Exchange site. However, fast sqrt CPU instructions exist on a lot of chips. For example, this Stack Overflow answer (https://stackoverflow.com/questions/7724061/how-slow-how-many-cycles-is-calculating-a-square-root) suggests a square root costs on the order of 20 CPU cycles. For $10{,}000$ objects on screen, recomputing square roots at $250$ Hz would use about 5% of a 1 GHz core's cycles ($10{,}000 \times 250 \times 20 = 5 \times 10^7$). And surely you can do better than that, so I don't agree with the statement "square root is a relatively slow algorithm for a computer to calculate for each point" unless you are computing many hundreds of thousands of square roots per frame at 250+ fps, in which case you would already be a specialist developer and you likely wouldn't be posting this question here.
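As a quick sanity check on your own machine, here is a minimal timing sketch in Python (variable names are my own; absolute numbers will vary by hardware and interpreter overhead):

```python
import timeit

# Average the cost of 10,000 math.sqrt calls over 100 runs.
per_run = timeit.timeit(
    'for i in range(1, 10_001): sqrt(i)',
    setup='from math import sqrt',
    number=100,
) / 100

print(f'10,000 sqrt calls: {per_run * 1e3:.3f} ms')
```

Even with Python's loop overhead, this typically comes in well under a frame budget, which supports the point that the square root itself is unlikely to be your bottleneck.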

On

Depending on how low-level the programming language you're working with is, implementing the fast inverse square root algorithm originally developed for Quake 3 could be a solution. Through bit manipulation and Newton's method it finds $\frac{1}{\sqrt{x}}$; you can then take the reciprocal of that value, or simply multiply it by $x$, since $x \cdot \frac{1}{\sqrt{x}} = \sqrt{x}$. This is very fast: if you take the reciprocal, the limiting factor is the single division at the end, and multiplying by $x$ avoids even that. You would have to be careful about values very near zero, however.
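A sketch of the classic single-precision version, ported to Python with `struct` to reinterpret the bits (the magic constant `0x5F3759DF` is the one from the Quake 3 source; the function name is my own):

```python
import struct

def fast_inv_sqrt(x):
    # Reinterpret the float32 bits of x as a signed int32.
    i = struct.unpack('=i', struct.pack('=f', x))[0]
    # The magic constant approximately halves and negates the exponent,
    # giving a rough first guess at 1/sqrt(x).
    i = 0x5F3759DF - (i >> 1)
    y = struct.unpack('=f', struct.pack('=i', i))[0]
    # One Newton iteration refines the estimate.
    y = y * (1.5 - 0.5 * x * y * y)
    return y

# sqrt(x) is then x * fast_inv_sqrt(x), trading the final division
# for a multiplication.
```

With the single Newton step, the relative error is a fraction of a percent, which is plenty for placing objects on screen.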

On

If your goal is only to place objects on the Fermat spiral, you could follow the suggestion given by @abiessu in the comments and use $\theta = t^2$. Then $r = \sqrt \theta = \sqrt{t^2} = \pm t$ and $x = \theta = t^2$.

You can vary the parameter $t$ and get $(r,\theta)$ pairs at which to place the objects. This requires only squaring and avoids the square root altogether. The trade-off is that evenly spaced values of $t$ sweep through $\theta$ faster and faster, so you place fewer objects over the same range of the spiral.
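As a minimal sketch of this parameterization (the function and parameter names are my own), generating points this way needs only multiplications plus the trig calls you already pay for:

```python
import math

def fermat_points(n, scale=1.0):
    """Place n points on a Fermat spiral using theta = t**2, so r = t."""
    points = []
    for t in range(n):
        r = scale * t      # r follows directly from t: no sqrt needed
        theta = t * t
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points
```

Here the integer step in `t` is what controls the spacing; you can substitute a smaller increment if the points thin out too quickly.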

On

Since this is time-critical, you are presumably calculating many square roots. If successive arguments are close together, you can take one step of the Newton iteration with the previous square root as the starting value: given $\sqrt{x - \epsilon}$, use it as the initial estimate for $\sqrt{x}$ and perform a single Newton step.
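A minimal sketch of that idea (names are my own): Newton's method for $f(y) = y^2 - x$ gives the update $y \mapsto \tfrac{1}{2}\left(y + \frac{x}{y}\right)$, seeded with the previous root:

```python
def sqrt_via_prev(x, prev_root):
    # One Newton step for y^2 = x, starting from the previous square root.
    return 0.5 * (prev_root + x / prev_root)

# Walking outward along the spiral, each radius seeds the next estimate:
root = 1.0
for x in [1.0, 1.1, 1.2, 1.3]:
    root = sqrt_via_prev(x, root)
```

Because Newton's method converges quadratically near the root, one step per point is enough when consecutive arguments differ only slightly.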

On

I'm going to assume that your computer uses IEEE 754 double-precision to store numbers. Write a function for converting between a floating-point number and its bit representation. In Python, you can do:

import struct

def get_bits_from_double(x):
    return struct.unpack('=q', struct.pack('=d', x))[0]

def get_double_from_bits(n):
    return struct.unpack('=d', struct.pack('=q', n))[0]

Of course, if you want fast calculations, you'll probably be using C instead of an interpreted language like Python. In C, you can use a union to store two numbers at the same memory location.

#include <stdint.h>

typedef union
{
    uint64_t  bits;
    double    value;
} DOUBLE_un;

inline uint64_t get_bits_from_double(double x)
{
    DOUBLE_un un;
    un.value = x;
    return un.bits;
}

inline double get_double_from_bits(uint64_t bits)
{
    DOUBLE_un un;
    un.bits = bits;
    return un.value;
}

Or something like that; I don't have a C compiler handy, so I haven't actually tested it.

Now, recall that a double is stored as a three-part bitfield, in the following order:

  1. sign (1 bit, 0 = positive / 1 = negative)
  2. exponent (11 bits)
  3. significand (52 bits)

Since the square root of a negative number isn't real, I'm going to assume that you're only going to pass positive numbers to the function. Thus, the sign bit will always be zero, and the exponent field will dominate the number. So the bit pattern is like an approximate logarithm. We want to approximate a square root, so let's see what happens when we cut the bit pattern in half.

double halve_bits(double x)
{
    uint64_t bits = get_bits_from_double(x);
    bits = bits / 2;
    return get_double_from_bits(bits);
}

If you evaluate the expression halve_bits(x) / sqrt(x) for various numbers, you'll get a very tiny ratio on the order of $10^{-154}$. This is because we neglected to account for the bias in the exponent field. We could compensate by multiplying the result by 1e154. But you wanted a fast function, so let's apply strength reduction instead, replacing the floating-point multiplication with an integer addition on the bit pattern. And of course, write it in C for speed.

double approx_sqrt(double x)
{
    uint64_t bits = get_bits_from_double(x);
    bits = bits / 2 + 2303426388484757850;
    return get_double_from_bits(bits);
}

The magic number on the penultimate line is the bit representation of the number $1.0914553763271334 \times 10^{-154}$. YMMV depending on exactly how you're calculating the error. But AFAICT, this function has a maximum relative error of 3.5%. If you need a more accurate square root, you can use the output of approx_sqrt as the initial guess for an iterative algorithm like Newton's method. But since you explicitly want a rough approximation, this will be fine.

So there you go: A fast approximate square root algorithm done entirely with integer math instead of floating-point. But you'll want to do some timing tests to make sure it's actually faster than the standard sqrt function.
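For experimenting before committing to C, here is a quick Python port of approx_sqrt using the same `struct` trick as the helpers above, plus the Newton refinement suggested earlier (the function names are my own):

```python
import struct

def approx_sqrt(x):
    # Same trick as the C version: halve the bit pattern, then add the
    # magic constant to correct for the exponent bias.
    bits = struct.unpack('=q', struct.pack('=d', x))[0]
    bits = bits // 2 + 2303426388484757850
    return struct.unpack('=d', struct.pack('=q', bits))[0]

def refined_sqrt(x):
    # One Newton step on top of the rough estimate.
    y = approx_sqrt(x)
    return 0.5 * (y + x / y)
```

The rough version stays within the few-percent error band described above, and the single Newton step brings it well under a tenth of a percent.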

On

To expand on the answer @Aidan R.S. gave: if you want to generate a list of sequential square roots, you can do it iteratively and save time by using the inverse square root. Define $y(x)=\frac{1}{\sqrt{x}}$ and use the Taylor expansion $y(x+\varepsilon)\approx y(x)-\frac{1}{2}\varepsilon\, y(x)^3 +\frac{3}{8}\varepsilon^2 y(x)^5$. Substituting $\varepsilon=1$ and using $\sqrt{x}=x\cdot\frac{1}{\sqrt{x}}$, we can generate a sequence of square roots without any divisions or square roots inside the loop:

import matplotlib.pyplot as plt

N = 10000

y = 2
h = y ** (-0.5)        # h tracks 1/sqrt(y)

y_list = [1, y]
x_list = [1, y * h]    # x tracks sqrt(y) = y * (1/sqrt(y))

for i in range(N):
    y += 1
    # Taylor step: 1/sqrt(y+1) ~ h - h**3/2 + 3*h**5/8
    h = h - 0.5 * h**3 + 0.375 * h**5
    x = h * y

    x_list.append(x)
    y_list.append(y)

plt.plot(y_list, x_list)
plt.plot(y_list, [y ** 0.5 for y in y_list])
plt.show()

This is more than accurate enough for visual purposes.