Hanek payne trigonometric range reduction detailed analysis of loss of significance.

1k Views Asked by At

I'm struggling to understand what is the actual problem described as "loss of significance" when a trigonometric range reduction argument is performed. I was trying to perform the analysis by myself but I'm not able to spot the problem. I know what the loss of significance is in floating point computation in general, but I don't understand in what terms is described in the algorithm I'm mentioning.

Taken from the paper "Radian Reduction for Trigonometric Function" (Hanek Payne Algorithm)

  1. Even if M is large enough to guarantee (p+K) valid bits, h' may have so many leading zero bits that there may be fewer than the (p+K) valid bits desired for the normalized equivalent argument.
  2. Similarly h' may have leading bits consisting entirely of ones, and if (1-h) is required for the function evaluation, (1-h') may contain fewer than the desired (p+K) valid bits.

Some notation... more or less from the paper

For any real number $x$ the range reduction consist in computing something like $$ \frac{2^i}{2\pi}x \; \text{mod} \; 2^i = I + h $$ where $i \in {0,1,2,3}$, $I$ is an integer that would allow to select the quadrant and finally $h \in [0,1)$ (real number in that interval). Actually the reduction consist in using $I$ as selector bits, and $h$ to compute an appropriate approximation of $sin$ or $cos$ in the sector indexed by $I$ using the argument $\frac{2\pi}{2^i} h$ as argument. For machine purpose it is assumed that

$$x = 2^k f ,\; \frac{1}{2} \leq f < 1$$

I.e. a floating point number (different from the IEEE standard, but the analysis would be similar), the number $f$ is represented using $p$ digits.

The authors of such paper later define $L$ as the largest integer such that the product of 2 numbers represented using $L$ digits are representable in a register. They also point out that the number $\frac{2}{\pi}$ has to be stored with an appropriate number of digits (virtually infinite). Again we can then write

$$ \begin{array}{l} x = 2^k \sum_{j=1}^{N-1} F_j 2^{-jL} \\ \frac{1}{2\pi} = D_0 + D_1 + D_2 \\ D_0 = \sum_{j=1}^{M} g_j 2^{-jL} \\ D_1 = g_{M+1}2^{-(M+1)L} \\ D_2 = \sum_{j=M+2}^{+\infty} g_j 2^{-jL} \end{array} $$

where $F_j$ is the $j$-th chunk of $L$ bits representing the mantissa $f$ and $g_j$ is the $j$-th chunk representing $\frac{1}{2\pi}$. What happens is that basically

$$ \frac{2^i}{2\pi}x \; \text{mod} \; 2^i \approx 2^i x D_0 \text{mod} \; 2^i $$

is being computed. In general we are interested in computing $h$ with $p+K$ digits, $p$ is the input precision while $K$ is a number of guard digits. What I don't really understand is why the points 7 and 8 I've highlighted are a problem for the reduction. I was trying to make an accurate analysis of the problem by myself but I'm not able to spot the problem. Can anyone explain to me?