Lets say I have an array A[x1,x2,x3,...xN] of size N.
for N = 4 , A = {x1,x2,x3,x4}. [1 based index]
Now,I have to tell how many tuples (i,j) are there such that i<=j and cumulative sum(i,j) is divisible by K.
For example, N = 4
A = { 3,2,5,7} and K = 5.
Now , For this case the answer is 3. as, ( 1,2 ) [cum_sum = 5 ,divisible by
K(5)], (3,3)
[cum_sum = 5 ,divisible by K(5)] , (1,3) [ cum_sum = 10 ,divisible by K(5)]
are such tuples.
I can do this in O(N^2) easily.But how to do this in O(N) or O(N*log(N)) ?
Assume the problem intends that all the $x_i$ are integers.
This can be done in order $O(N)$ as follows:
Create a (zero-based) vector $C$ of size $K$, initially all zeros. $C_m$ at stage $m$ of the algorithm represents the number of ways to form a sum of $m \mod K$ using a consecutive sub-sequence of $\{ x_1, \ldots x_m\}$. Thus after stage $N$, $C_0$ will be one more than the answer to the problem (because we will have counted the empty set, which has a sum vacuously equal to zero mod $K$.
Also create a (zero-based) vector $D$ of size $K$, initially all zeros. $D_m$ at stage $m$ of the algorithm represents the number of ways to form a sum of $m \mod K$ using a consecutive sub-sequence of $\{ x_1, \ldots x_m\}$ that ends at $x_m$.
At stage $0$ change $C_0$ to $1$; this reflects the sum of no elements of the set equaling zero.
At each subsequent stage $n$, for $1 \leq n \leq N$, for $i = 0 \ldots K-1$, do $$ D'_{(i + x_n)\mod K}= D_i + [i \equiv x_n {}_{\pmod K}] \\ C_i= C_i + D'_i $$ where the notation $[i \equiv x_n {}_{\pmod K}]$ means $1$ if $x_n$ is congruent to $i$ modulo $K$, and $0$ otherwise.
At this point $D'_i$ is the number of ways to form a sum of $i \pmod K$ using a consecutive sub-sequence of $\{ x_1, \ldots x_n\}$ that ends at $x_n$, because each of the earlier substrings can lead to sumsthat are offest by $x_n$ from their previous sums, and there is one extra way to choose just $x_n$ which sums to itself. So complete the stage by setting up the the stage $n$ values of $D$: $$D = D'$$
After stage $N$, the answer is $C_0 -1$.
The time complexity of this algorithm is $O(K)$ for each stages, and there are $N$ stages, so for fixed $K$, the algorithm executes in $O(N)$ time.