I have an application where I need to compute $\cos(s)$ (and $\operatorname{sinc}(s) = \sin(s)/s$) a large number of times, and this has been measured to be a bottleneck in my application.
I don't need every last digit of accuracy: $10^{-10}$ absolute error over the input range $-15^\circ < s < 15^\circ$ should be sufficient (this is the limit of the input sensor data).
I have implemented a simple Taylor approximation, but would like to ask:
- (a) Is there a more efficient approximation?
- (b) If Taylor is the most efficient, is there a better way to implement it?
```
// Equation Coefficients
double s_2 = 0.25 * omega_ebe.squaredNorm();
double s_4 = s_2 * s_2;
double s_6 = s_4 * s_2;
double s_8 = s_6 * s_2;
double s_10 = s_8 * s_2;
double cos_coef  = 1.0 - s_2 / 2.0 + s_4 / 24.0 - s_6 / 720.0
                       + s_8 / 40320.0 - s_10 / 3628800.0;
double sinc_coef = 0.5 - s_2 / 12.0 + s_4 / 240.0 - s_6 / 10080.0
                       + s_8 / 725760.0 - s_10 / 79833600.0;
```
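Regarding (b): one standard implementation improvement is to evaluate the polynomials with Horner's scheme, which replaces the explicit powers and divisions with nested multiply-adds. A minimal sketch using the same coefficients as above, written as polynomials in $t = s_2$ (the Eigen vector `omega_ebe` is not shown in the question, so the functions here take the precomputed squared argument directly):

```cpp
#include <cassert>
#include <cmath>

// Degree-10 Taylor polynomials in t = s^2, evaluated with Horner's scheme.
// cos_horner(t)       ~= cos(sqrt(t))
// half_sinc_horner(t) ~= sin(sqrt(t)) / (2 * sqrt(t))
// The coefficients are the reciprocals of the factorials from the question,
// so every division becomes a compile-time constant multiplication.
static inline double cos_horner(double t) {
    return 1.0 + t * (-1.0 / 2.0 + t * (1.0 / 24.0 + t * (-1.0 / 720.0
               + t * (1.0 / 40320.0 + t * (-1.0 / 3628800.0)))));
}

static inline double half_sinc_horner(double t) {
    return 0.5 + t * (-1.0 / 12.0 + t * (1.0 / 240.0 + t * (-1.0 / 10080.0
               + t * (1.0 / 725760.0 + t * (-1.0 / 79833600.0)))));
}
```

Five fused multiply-adds per polynomial is about the best a degree-10 even polynomial can do serially; on PowerPC and ARM a compiler at `-O2`/`-ffp-contract=fast` will typically emit `fma` instructions for this shape.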
EDIT: I haven't forgotten to select an answer! I'm going to code up a few of these and run them on target (an embedded PowerPC and an embedded ARM) to see how they perform.


This is probably more a question for Stack Overflow, because how efficient an algorithm is depends on the processor architecture, you have to weigh lookup tables against direct calculation, and programmers are really good at that.
It is not surprising, then, that there is already a question that covers your problem quite well: https://stackoverflow.com/questions/2088194/fast-sin-cos-using-a-pre-computed-translation-array. Most of the answers don't state their accuracy, but you can measure it afterwards and increase the accuracy if it is not high enough.
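Measuring the accuracy of a candidate is cheap: a brute-force sweep of the $\pm 15^\circ$ input range against the library `cos` gives the worst-case error directly. A sketch, using the question's own degree-10 Taylor polynomial as the candidate under test:

```cpp
#include <cassert>
#include <cmath>

// Candidate under test: the question's degree-10 Taylor polynomial for
// cos(s), written as a polynomial in t = s^2 (Horner form).
static double cos_approx(double s) {
    double t = s * s;
    return 1.0 + t * (-1.0 / 2.0 + t * (1.0 / 24.0 + t * (-1.0 / 720.0
               + t * (1.0 / 40320.0 + t * (-1.0 / 3628800.0)))));
}

// Sweep [-15 deg, +15 deg] and return the worst absolute error vs. std::cos.
static double max_abs_error() {
    const double kPi = 3.14159265358979323846;
    const double limit = 15.0 * kPi / 180.0;
    double worst = 0.0;
    for (int i = -10000; i <= 10000; ++i) {
        double s = limit * static_cast<double>(i) / 10000.0;
        worst = std::fmax(worst, std::fabs(cos_approx(s) - std::cos(s)));
    }
    return worst;
}
```

Swap any candidate (table lookup, minimax polynomial, ...) into `cos_approx` and the same sweep tells you whether it clears the $10^{-10}$ requirement before you bother benchmarking it on target.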