I found this algorithm from the GNU factor utility.
Given a double width dividend $n=n_1B + n_0$ and a single width odd divisor $d$, where $n_1, n_0, d < B=2^w, 2\nmid d$. Then with the precomputed modular inverse $d' = d^{-1}\;\mathrm{mod}\;B$ and a divisor limit $q' = \lfloor\frac{B-1}{d}\rfloor$, the divisibility of $d$ on $n$ can be determined using the following procedure:
- $q_0 := n_0d'\;\mathrm{mod}\;B$
- $n' := \lfloor q_0d/B\rfloor$.
- If $n'>n1$, then $d\nmid n$.
- $q_1:=(n_1-n')d'\;\mathrm{mod}\;B$
- If $q_1$ > $q'$,then $d\nmid n$.
- Otherwise $d\mid n$ and $d/n = q_1B+q_0$
The single width version of this algorithm is discussed in Is there a fast divisibility check for a fixed divisor?, but I found it still quite hard to understand this algorithm after reading it.
How to prove this algorithm? And does this algorithm work for multi-precision integers ($n=n_0 +n_1B+n_2B^2_...$)? Thanks!
Think of this as long division of a two-digit number in base $B$ by a one-digit number. The algorithm is trying to find $q_1B + q_0$ such that $$n_1B + n_0 = d(q_1B + q_0) = (dq_1)B + dq_0$$
Now it is certainly possible that $dq_0 \ge B$. But since both $d, q_0 < B$, it must be $< B^2$. so we can write $dq_0 = p_1B + p_0$ for some $p_1, p_0 < B$. Thus $$n_1B + n_0 = (dq_1)B + p_1B + p_0 = (dq_1+p_1)B + p_0$$ Since $0 \le n_0 < B$ and $0 \le p_0 < B$, this can only be if $p_0 = n_0$. This means that $$dq_0 \equiv p_0 = n_0 \mod B$$ multiplying by $d'$, $$d'n_0 \equiv d'dq_0 \equiv 1q_0 = q_0 \mod B$$
Thus $q_0 = (d'n_0 \bmod B)$. Now also $$n_1 = dq_1 + p_1$$ and $$p_1 = \frac{dq_0 - n_0}B = \left\lfloor \frac{dq_0}B\right\rfloor = n'$$
And so $$dq_1 = n_1 - n'$$ we can multiply by $d'$ again and working modulo $B$, we get $$q_1 = d'(n_1 - n')\bmod B$$ as a candidate value for $q_1$. But all this formula guarantees is that $q_1d \equiv n_1 - n'\mod B$, and we need $q_1d = n_1 - n'$.
If $n_1 - n' = dq_1$, then
$$n_1 = dq_1 + n' = dq_1 + p_1$$ $$\begin{align}n=n_1B + n_0 &= (dq_1 + p_1)B + n_0 \\&= (dq_1 + p_1)B + p_0 \\&= dq_1B + (p_1B + p_0) \\&= dq_1B + dq_0\\&=d(q_1B + q_0)\end{align}$$ $$n = d(q_1B + q_0)$$ So $d\mid n$ and the quotient is $q_1B + q_0$.
And yes, this algorithm can be extended to higher numbers of "digits", though it gets a bit more complicated.