I have several mathematical operations on binary numbers that are special cases of more general arithmetic operations. I am wondering whether there exist more specialized algorithms purpose-made for these edge cases that are faster than the general algorithm. Could someone point me to such algorithms if they exist?
Please note I'm talking specifically about doing math on unsigned binary integers only. I'm not interested in binary floats or negative numbers, only in non-negative binary integers.
I'm open to algorithms that exploit a GPU's CUDA cores as well as general CPU algorithms.
Here are the specific edge cases I'm looking for specialized algorithms for:
Squaring a binary number. Currently I'm using the generic powering algorithm with an exponent of 2 to square a given binary number. Is there a specialized squaring algorithm I can use instead that's faster than general powering?
Taking only the remainder of a binary division, without needing the quotient. Currently, if I'm only after the remainder of dividing two binary numbers, I do the entire division, store both the quotient and the remainder, and never look at the quotient again. Is there a (faster) specialized algorithm that computes only the remainder?
Adding 1 to a binary number. Currently I do the literal addition (N+1) using the general binary addition algorithm. (For this one I've found a few algorithms, but I'm asking here because I don't know which one is the fastest.)
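To make the three cases concrete, here is a minimal Python sketch of what I mean. The function names and the little-endian 32-bit limb layout are assumptions made purely for illustration, not my actual code. The first two functions show the generic approach I'm using today; the last one shows the kind of specialization I'm after, an increment that stops as soon as the carry dies out. I don't know whether it is the fastest option.

```python
LIMB_BASE = 2**32  # assumed limb size for the illustration


def square_via_power(n: int) -> int:
    # Current approach: general exponentiation with the exponent fixed at 2.
    return pow(n, 2)


def remainder_via_divmod(a: int, b: int) -> int:
    # Current approach: full division, then the quotient is thrown away.
    _quotient, remainder = divmod(a, b)
    return remainder


def increment(limbs: list[int]) -> list[int]:
    """Return limbs + 1 for a little-endian list of limbs in [0, LIMB_BASE).

    Illustrative specialization: ripple the carry only as far as it goes,
    instead of running a full two-operand addition.
    """
    out = list(limbs)
    for i, limb in enumerate(out):
        if limb + 1 < LIMB_BASE:
            out[i] = limb + 1  # carry absorbed, higher limbs are unchanged
            return out
        out[i] = 0  # limb overflowed, keep carrying
    out.append(1)  # carry ran off the top, so the number grows by one limb
    return out
```

For example, `increment([LIMB_BASE - 1, 7])` only touches the first two limbs and returns `[0, 8]`.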