Finding the size of sumsets in limited space

41 Views Asked by At

Let $S=\{a+b:\ a\in A,\ b\in B\}$. I have an explicit representation of $A$ and $B$, but $S$ is too large to store in memory. (For the sake of argument, say $A$ and $B$ are 100 MB and $S$ is 1000 TB.) Is there a reasonably efficient method for finding $|S|$?

The special case $B=\{0,n\}$ for $n\ne0$ is useful: with $A$ sorted, loop through its elements $a$ and check if $na$ is in $A,$ incrementing a counter if not. Then merely add the number of elements in $A$. (This takes $|A|\log|A|$ time and no additional space beyond that needed for the input and output.) But this does not scale easily.

(In fact in my application I am counting $S'=\{ab:\ a\in A,\ b\in B\}$ but these correspond via logarithms.)

1

There are 1 best solutions below

0
On BEST ANSWER

If $A$ and $B$ fit in memory and $A$ is sorted, it is easy enough to enumerate $A+\{b\}$ in increasing order using constant space for each $b\in B$. So you'll have space to construct $|B|$ such generators and lazily merge the streams that come out of them. This will enumerate $A+B$ in increasing order, with repetitions, and it is easy to count the number of distinct elements as they are produced.

This has good $O(|A|+|B|)$ space complexity, but the running time of $O(|A||B| \log |B|)$ is nothing to write home about.