Efficient Method to find funsum$(n) \pmod m$ where funsum$(n)= 0^0 + 1^1 + 2^2 + ..... n^n$

290 Views Asked by At

I have a series: $$f(n) = 0^0 + 1^1 + 2^2 + ..... n^n$$ I want to calculate $f(n) \pmod m$ where $n ≤ 10^9$ and $m ≤10^3$. I have tried the approach of the this accepted answer but the complexity is more than acceptance. I have used this approach for calculating modular exponential. However, I am not able to optimize it further. Kindly help. Thanks in advance.

2

There are 2 best solutions below

4
On BEST ANSWER

For an efficient computation, we have to use the well-known fast exponentiation by repeated squaring, based on the simple observations $x^{2k}=(x^k)^2$ and $x^{2k+1}=x^{2k}\cdot x$. There's no need in a recursive function, because it's easy to transform it into an iteration through the binary expansion of the exponent. But that would still give an algorithm of complexity $O(n\log n),$ and that's quite forbidding already for $n=10^8,$ let alone $n=10^9.$ Savings are possible since we need the results only modulo $m$, and that's assumed to be much smaller than $n$.
Obviously, large exponents aren't the problem, so I consider reduction of the exponents modulo $\varphi(m)$ a blind alley: we're summing over all bases up to $n$, many of them having common divisors with $m$ (60% in the obviously allowed case $m=1000$).
But certainly $l=j\pmod m$ implies $l^l=j^l\pmod m$, so we can group our summands accordingly. Let $n=mk+r$ with $0\le r\le m-1.$ Then, $$f(n) = 1+\sum^n_{l=1}l^l=1+\sum^{m-1}_{j=1}j^j\sum^{k-1}_{i=0}j^{mi}+\sum^{r}_{j=1}j^j\cdot j^{mk}\pmod m.$$ With an auxiliary function $$g(x,k)=\sum^{k-1}_{i=0}x^i,$$ we can write this $$f(n) = 1+\sum^{r}_{j=1}j^j\,g(j^m, k+1)+\sum^{m-1}_{j=r+1}j^j\,g(j^m, k)\pmod m.$$ Unfortunately, using the fast formula $$g(x,k)=\frac{x^k-1}{x-1}$$ is not an option, as division modulo a composite $m$ can get very tricky.
But there's an alternative: Obviously, we have $g(x,2k)=(1+x^k)\,g(x,k)$ and $g(x,2k+1)=1+x\,g(x,2k)$, and we don't even have to calculate $x^k,$ since $1+x^k=(x-1)\,g(x,k)+2,$ i.e. $g(x,2k)=((x-1)\,g(x,k)+2)\,g(x,k)$.
As one can see easily, the resulting algorithm has complexity $O(m\log n)$, and that's reasonably fast (milliseconds) for the values of $n,m$ in the question.
The crucial functions are implemented like this (in Java):

private static long f(long n, int mod) {
    long s = 1;
    long k = n / mod;
    long r = n % mod;
    for (int j = 1; j < mod; j++) {
        long x = Power.pow(j, mod, mod);
        s = (s + Power.pow(j, j, mod) * g(x, k + ((j <= r) ? 1 : 0), mod)) % mod;
    }
    return s;
}

private static long g(long x, long k, int mod) {
    if(k == 0) return 0;
    long mask = Long.highestOneBit(k);
    long g = 1;
    while ((mask >>= 1) > 0) {
        g = ((x - 1) * g % mod + 2) * g % mod;
        if ((k & mask) != 0) {
            g = (g * x + 1) % mod;
        }
    }
    return g;
}
22
On

$$\sum_{k=1}^n k^k\bmod m\equiv \sum_{k=1}^n(k\bmod m)^k.$$

You can precompute $k^k\bmod m$ for all $k\in[0,m-1]$ (this takes $O(m\log m)$ modular multiplies by squarings). For the next terms, $(k+pm)^{k+pm}\equiv k^k(k^m)^p$, and it suffices to also keep the values of $k^m$.

The total cost for the summation will be

$$\color{green}{O(m\log m+n)}.$$

For example, with $m=5$, and assuming $0^0=0$,

$$\ \ \ \ \ k^5\to0,1,2,3,4,$$ $$\ \ \ \ \ k^k\to 0,1,4,2,1,\\\ k^{k+5}\to0,1,3,1,4,\\k^{k+10}\to0,1,1,3,1,\\k^{k+15}\to0,1,2,4,4,$$ $$k^{k+20}\to0,1,4,2,1,\\k^{k+25}\to0,1,3,1,4,\\k^{k+30}\to0,1,1,3,1,\\k^{k+35}\to0,1,2,4,4,\\\cdots$$

In this table, the columns are the powers of $k$, which show a period of at most $m-1$ (actually $\phi(m)$, Euler's totient). Hence, blocks of $m\cdot(m-1)$ elements bring a constant contribution.

After computing a complete block by the above method, the complexity lowers to

$$\color{green}{O(m^2)},$$

using $$f(n)=f(n\bmod (m-1)m)+\left\lfloor\frac n{(m-1)m}\right\rfloor b$$ where $b$ is the sum over a whole block

Finally, we can yet speed-up the process by accumulating the columns of a block (be them complete or not) with the geometric summation formula

$$k^k+k^{k+m}+k^{k+2m}+\cdots k^{k+pm}=k^k\frac{k^{m(p+1)}-1}{k^m-1}.$$

After precomputing a table of inverses, the last expression can be computed in time $O(\log m)$, hence the whole process takes

$$\color{green}{O(m\log m)}.$$