In the paper "Shor's algorithm with fewer (pure) qubits", Zalka points out that a multiplication by an arbitrary constant $C$ modulo $N$ can be reduced in cost by finding smaller values $c_1$, $c_2$ such that $c_1 \cdot c_2^{-1} = C \pmod{N}$. He gives an algorithm for computing these values $c_1$ and $c_2$ by stopping the extended Euclidean algorithm halfway through, such that $c_1$ and $c_2$ have $n/2 + O(1)$ bits whereas $C$ would have had $n = \lg N$ bits. I call breaking up $C$ into half-sized multiplicative pieces in this way pseudo-factoring $C \pmod{N}$.
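As a quick sanity check on the size claim, here is a brute-force illustration (not Zalka's algorithm; the modulus $N = 101$ and the 4-bit bound are arbitrary choices for the example). It confirms that for a 7-bit prime modulus, every constant $C$ splits into two pieces of roughly $n/2$ bits, provided negative $c_1$ values are allowed:

```python
# Brute-force check: every C mod the 7-bit prime N = 101 can be written as
# c1 * c2^-1 (mod N) with c2 and |c1| both fitting in 4 bits.
N = 101
bound = 1 << 4  # roughly n/2 bits
for C in range(1, N):
    ok = False
    for c2 in range(1, bound):
        r = (C * c2) % N
        if r > N // 2:
            r -= N  # use the signed residue; Zalka's pieces may be negative
        if abs(r) < bound:
            # r == C * c2 - k*N for some k, so r * c2^-1 == C (mod N)
            ok = True
            break
    assert ok, f"no half-sized split found for C={C}"
print("every C mod 101 splits into two 4-bit pieces")
```

That such a split always exists follows from Dirichlet's approximation theorem applied to $C/N$: some $c_2 \le \sqrt{N}$ makes the signed residue of $C \cdot c_2$ at most about $\sqrt{N}$ in magnitude, which is essentially what the halfway-stopped Euclidean algorithm finds.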
The benefit of doing two multiplications by constants with $n/2$ bits each, instead of one multiplication by a constant with $n$ bits, is that the size of the constant determines the additional workspace needed for the multiplication. Using more, smaller constants reduces the space requirement without meaningfully affecting the time cost (at least with schoolbook multiplication), which matters for the overall cost of Shor's algorithm.
I was wondering if there are similarly simple tricks for splitting an unknown constant $C$ into even more smaller pieces. For example, into four quarter-sized pieces (it doesn't matter if none, some, or all are multiplicative inverses) instead of two half-sized pieces. Or maybe all the way down into just factors of 2 and 3, like $C = 2^a 3^b \pmod{N}$ where $a$ and $b$ each have $O(n)$ bits. Or maybe this can be shown to be equivalent to some hard problem? Certainly the $2^a 3^b$ version sounds like it would be as hard as the discrete logarithm.
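For tiny moduli the $2^a 3^b$ form can at least be exhibited by brute force. The following toy search (my own illustration, exponential in $n$, and obviously not a proposed solution) just scans the exponents:

```python
def split_into_2_3(C: int, N: int):
    """Brute-force search for exponents a, b with 2^a * 3^b == C (mod N).

    For a prime modulus the exponents only matter modulo the group order
    N - 1.  The search takes time exponential in the bit length of N, so
    this is only a toy demonstration of the problem statement.
    """
    for a in range(N - 1):
        for b in range(N - 1):
            if (pow(2, a, N) * pow(3, b, N)) % N == C:
                return a, b
    return None  # C lies outside the subgroup generated by 2 and 3
```

One can check that 2 is a primitive root modulo 101, so `split_into_2_3(C, 101)` succeeds for every $C$ there; the point of the question is whether anything subexponential exists, given that the problem smells like a discrete logarithm.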
Here is Python code that computes pseudo-factors roughly in the way Zalka described:
```python
import math
from typing import Tuple


def pseudo_factor_mod(f: int, N: int) -> Tuple[int, int]:
    """Return a pair (c1, c2) with c1 * c2^-1 == f (mod N), where both
    entries have roughly half as many bits as N."""
    # Run the extended Euclidean algorithm on (f, N), maintaining the
    # invariants a == x*f + y*N and b == u*f + v*N at every step.
    x, y = 1, 0
    u, v = 0, 1
    a = f
    b = N
    pairs = []
    while b:
        q = a // b
        x, u = u, x - q * u
        y, v = v, y - q * v
        a, b = b, a - q * b
        # From the invariants, b == u*f (mod N) and a == x*f (mod N), so
        # (b, u) and (a, x) are candidate pairs whenever the inverse exists.
        if math.gcd(u, N) == 1:
            pairs.append((b, u))
        if math.gcd(x, N) == 1:
            pairs.append((a, x))
    # Keep the pair whose larger entry is smallest; near the midpoint of the
    # algorithm both entries have n/2 + O(1) bits.
    best_pair = min(pairs, key=lambda pair: max(
        abs(pair[0]).bit_length(),
        abs(pair[1]).bit_length()))
    return best_pair
```

For example, `pseudo_factor_mod(37, 101)` returns `(10, 3)`, and indeed $10 \cdot 3^{-1} = 37 \pmod{101}$, with both entries fitting in 4 bits rather than 7.