Computing $\sum_{k=1}^n (a^k \bmod m)$

239 Views Asked by At

I would like to find a closed form solution for

$$\sum_{k=1}^n (a^k \bmod m)$$ $$0<a<m, n > 0$$

Note that the mod operator is within the brackets.

If a closed form solution does not exist, I'm interested in good approximations, and how I should have gone about concluding that a closed form solution can't be found.

Solutions that require restrictions on $a$ and $m$ are of interest too.

Background

$a^k \bmod m$ is a very simple hashing function. It gives the $k$th output of the simplified linear congruential generator PRNG $X_{k+1} = (a X_k \bmod m)$. Large values of $a$ and $m$ are typically seen. A common example is $a=16807, m=2^{31}-1$. I'm looking for a way to sum the first $n$ pseudorandom outputs. My only requirement is that it be visually random, so values of $a$ and $m$ (and even small modifications to the formula) are flexible. An ideal solution would work for any $a$ and $m$, but that is not essential.

What I've tried

There's a very obvious basic linear approximation of a sum of evenly distributed random values given by $k$ times the mean value. This isn't good enough for my case as I need to capture the visually random nature of the values.

The main trick that I know for dealing with mod is to isolate the behaviour of a single period and then sum it over each. I'm struggling to apply that here for this nonlinear case.

I started by tackling some simplified similar problems. If we remove the mod operator, this is just a geometric series which is obviously trivial. I tried tackling the simpler $\sum_{k=1}^n(k \bmod m)$ and was able to get a result by isolating the period:

$$\left\lfloor \frac a n \right\rfloor \sum_{i=1}^{m-1}i + \sum_{i=1}^{n \bmod m}i$$

By a similar process, I thought it might be possible to isolate the geometric series and sum over the periods. While $a^k < m$ the geometric series gives the answer. This can be written as

$$\sum_{k=1}^{\lfloor\log_a m \rfloor} a^k$$

I don't know how to take this further. I tried using $\lfloor\log_a jm\rfloor$ to construct a sum of sums along the following lines, but that introduces awkward floor functions as limits and I don't know what to do with that at all.

$$\sum_{j=1}^l {\sum_{k=\lfloor\log_a jm\rfloor}^{\lfloor\log_a (j+1)m\rfloor}{a^k}}$$

I think those limits are a little bit wrong actually, but I'm not worried about nailing them down unless I know how to handle their form.


I also tried a different approach, by rewriting the mod function using floor.

$$\sum_{k=1}^n (a^k - m \left\lfloor \frac{a^k}{m} \right\rfloor)$$

$$\sum_{k=1}^n a^k - m\sum_{k=1}^n \left\lfloor \frac{a^k}{m} \right\rfloor$$

From here I just face the same difficulties with the floor part as I did above. It feels like there may not be a closed form solution available, but I don't know how I would go about realising that and knowing to stop looking.

1

There are 1 best solutions below

2
On BEST ANSWER
$ \newcommand{\ord}{\mathop{\rm ord}\nolimits} $

As a technical note about notation: $(x \ \mathrm{mod}\ m)$ is an infinite set of integers and sums of infinite sets are more infinite sets. You could unambiguously indicate what you want by $$ \sum_{k=1}^n f(a^k \ \mathrm{mod}\ m) \text{,} $$ where $f$ takes a residue class $x \ \mathrm{mod}\ m$ to its least non-negative member. (Note that other choices are possible. A not entirely rare choice is the member with least magnitude. Note that this choice has an ambiguity when the modulus is even, so a full specification requires resolving whether to produce the positive or negative member when both have the same magnitude.)

How do we compute $$ s(a,m,n) = \sum_{k=1}^n f(a^k \ \mathrm{mod}\ m) \text{?} $$ Since we can reduce the modular arithmetic if $\gcd(a,b) > 1$, let us assume $\gcd(a,m) = 1$. (If $\gcd(a,m) = b > 1$, then we should instead study $s(a/b, m/b,n)$.) Let $\phi$ be the Euler totient function and $\ord_m a$ be the multiplicative order of $a$ modulo $m$. By Euler's theorem, $\ord_m a$ divides $\phi(m)$, so as long as we can factor $\phi(m)$, a binary search can find the multiplicative order of $a$ modulo $m$.

The powers of $a$ modulo $m$ are periodic, with period $\ord_m a$. So we apply the division algorithm to determine how many whole periods and how much is left when we partition $n$ into pieces of size $\ord_m a$. In particular, let $q$ and $r$ be such that $q \geq 0$, $ 0 \leq r < \ord_m a$ and $$ n = q \ord_m a + r \text{.} $$ We find that $n$ is $q$ full periods of the powers of $a$ with a remainder of $r$ more powers of $a$.

Let $$ p(a,m) = \sum_{k=1}^{\ord_m a} f(a^k \ \mathrm{mod}\ m) \text{.} $$ Then $$ s(a,m,n) = q \, p(a,m) + \sum_{k=1}^r f(a^k \ \mathrm{mod}\ m) \text{.} $$

Examples (timings appear to the right): $\ord_{2^{31}-1}16\,807 = 2\,147\,483\,646$. \begin{align*} n& & s(16\,807, n, 2^{31} - 1)& \\ 1& & 16\,807& \\ 2& & 282\,492\,056& \\ 3& & 1\,905\,142\,129& \\ 4& & 2\,890\,085\,787& \\ 5& & 4\,034\,194\,717& \\ 10& & 9\,529\,300\,043& \\ 15& & 13\,441\,838\,169& \\ 20& & 17\,004\,096\,180& \\ 40& & 42\,026\,616\,212& \\ 60& & 64\,128\,301\,397& \\ 80& & 87\,098\,290\,857& \\ 100& & 111\,330\,854\,817& \\ 10^5& & 107\,435\,233\,385\,977& & (0.1 \,\mathrm{s})& \\ 10^6& & 1\,073\,806\,376\,451\,147& & (1.2 \,\mathrm{s})& \\ 10^7& & 10\,737\,818\,730\,605\,039& & (14 \,\mathrm{s})& \end{align*} $\ord_{2^{31}-1}2 = 31$.
\begin{align*} n& & s(2, n, 2^{31} - 1)& \\ 1& & 2& \\ 2& & 6& \\ 3& & 14& \\ 4& & 30& \\ 5& & 62& \\ 10& & 2046& \\ 15& & 65534& \\ 20& & 2\,097\,150& \\ 40& & 2\,147\,484\,669& \\ 60& & 3\,221\,225\,469& \\ 80& & 4\,295\,491\,580& \\ 100& & 6\,442\,451\,195& \\ 10^5& & 6\,925\,701\,870\,437& & (70 \,\mu\mathrm{s})& \\ 10^6& & 69\,273\,527\,484\,932& & (57 \,\mu\mathrm{s})& \\ 10^7& & 692\,735\,276\,946\,410& & (65 \,\mu\mathrm{s})& \\ 10^8& & 6\,927\,365\,633\,427\,248& & (62 \,\mu\mathrm{s})& \\ 10^9& & 69\,273\,664\,924\,010\,478& & (63 \,\mu\mathrm{s})& \end{align*} Note that, as a matter of coding,

  • If $\ord_m a$ is not too large, you should probably cache $p(a,m)$ and the table of values of $\sum_{k=1}^r f(a^k \ \mathrm{mod}\ m)$ for $r =0, 1, \dots, (\ord_m a)-1$. Then $s(a,m,n)$ is a quotient and remainder calculation to get $q,r$ and then $q\,p(a,m)$ is a simple multiplication and $\sum_{k=1}^r f(a^k \ \mathrm{mod}\ m)$ is a table lookup.
  • If $\ord_m a$ is too large to store a full table of sums for each possible remainder, store a table of $a^{2^k}\ \mathrm{mod}\ m$ for $k = 1, \dots, \lfloor \log_2(\ord_m a) \rfloor$ and use exponentiation by squaring (with operations modulo $m$). (The timings in the table above did not use this method, although $\ord_{2^{31} - 1}16\,807 = 2\,147\,483\,646$ is an excellent candidate, requiring only $31$ squares of $a$ modulo $m$ be stored.)
  • There should be a method, similar to the Fast Fourier Transform to decompose the remainder sum into blocks by which bits are set to $1$ in $k$ and accumulate the powers in $r \log r$ time rather than $r^2$ time, but I have not thought through the details of such a method.