Partition a positive integer sequence into subsequencies of equal weight

86 Views Asked by At

For a finite sequence of $N$ positive integers $a_1, a_2,.., a_N$ let us define its weight as $w (\{a_i\}) = \log(N) \cdot \sum_{1}^{N}{a_i}$.

I want to partition such sequence into $K$ non-empty subsequencies
$\{a_1, a_2,.., a_{j1}\}, \{a_{j1+1}, a_{j1 + 2},.., a_{j2}\},.., \{a_{j(K-1)+1}, a_{j(K-1)+2},.., a_{N}\}$
in a way that their weights are close to equal.
Or, more formally, I want to minimize maximum weight in resulting subsequences: $\max w (\{a_i\}_j) \rightarrow \min$.

I need a solution that can be calculated in $O(N \log N)$ time.
I don't need a precise solution, numerical and approximate one will suffice.

2

There are 2 best solutions below

5
On

Well the $\log N$ in the formula doesn't really affect our strategy. In general even if you had $w(a)=\sum_{i}a_if(i)$ with $f(i) >0$ the strategy would be the same, just with doing the following for the array $a_i' = a_i f(i)$. Keep in mind that the details and implementations are different depending on how you treat the $\log N$ coefficient. We will suppose that it is part of $a_i$.

Rewriting the problem

We wish to parition $a$ in $K$ segments $S_1\dots S_K$ such that $S_1=[1, i_1]=[i_0,i_1], S_2=[i_1+1,i_2],\dots,S_K =[i_{K-1}, i_K]=[i_{K-1}, N]$ such that $\max_{i=1}^Kw(S_i)$ is minimum in $O(N\log N)$. Where $w(T) =\sum_{i\in T}a_i$.

Solution: Binary search on the maximum weight. In that way you can find the minimum maximum weight, which is what you want to find. It suffices to be able to check if $a$ can be partitioned in $K$ continuous subsequences each having weight smaller than $m\in\mathbb{R}$, i.e find if the minimum maximum is not greater than $m$, in $O(N)$.

Finding if the minimum maximum is not greater than $m$

This is a very simple problem which can be solved greedily in $O(N)$.

If there is an element $a_i > m$ it can't happen. Now suppose $a_i\leq m$, $\forall i\in[N]$. Also notice that, since we must have $K\leq N$ and $a_i > 0$, if we can partition tha array in $K'\leq K$ parts such that their maximum is not greater than $m$, we can also parition them in $K$ such parts (just start seperating elements from the $K'$ partitions).

Let's say you have partitioned $a$ in segments $[1,i_1],[i_1+1,i_2],\dots[i_{K-1}+1, i_K]$ named $S_1,\dots S_K$. If $w(S_1) + w(\{a_{i_1 +1}\})\leq m$ then $S_1\cup \{a_{i_1+1}\}, S_2\setminus \{a_{i_1+1}\}, S_3, \dots ,S_k$ is also a valid solution. This implies that we can try and fit as many elements as we can in each part. If we can fit all the parts in not greater than $K$ segments, then we are done.

We have solved the problem in $O(N\log N)$ time ($O(N)$ for checking each of the $O(\log N)$ values we will have to check).

0
On

Let $t$ be a threshold. I'll show how to test whether or not there exists a solution, where all subsequences have weight $\le t$, in $O(N)$ time.

Now use binary search on $t$, to find the smallest $t$ such that the answer is "yes there exists a solution". Binary search takes logarithmically many iterations, and each iteration takes $O(N)$ time. Assuming the sum of the $a_i$'s is not too large (specifically, polynomial in $N$), the whole procedure can finish in $O(N \log N)$ time, or something similar.

So how do you test whether there exists a solution where subsequences have weight $\le t$? Use a greedy algorithm. Add as many items as possible to the first subsequence, without exceeding weight $t$. Then do the same for the second subsequence, and so on. No need to backtrack. If in the end you find a solution where all subsequences have weight $\le t$, and you need $\le K$ subsequences, then the answer is "yes there exists a solution". Otherwise the answer is "no there does not exist a solution".