Lower Bound Space complexity one pass algorithm / Heavy-Hitters Problem

93 Views Asked by At

I am confronted with the following problem:

Let S be the family of all m-subsets of $[n] = [2m]$ let $S_1, S_2 \in S$ be distinct sets and let the state of storage be $\text{State}_1$ after stream $S_1$ is processed(resp. $\text{State}_2$)

Show that $\text{State}_1 \ne \text{State}_2.$

Background:

  • The algorithm that processes this is deterministic and that is able to solve the problem in one pass requires $\Omega(m)$ bits of storage.

  • We are trying to solve the HH-Problem which is basically finding all elements in a stream that appear with a certain frequency.

  • $[n]$ is defined as ${\{1,\ldots,n}\}:n\in N$

  • length of each subset is m from taken from 2m

My Ideas so far:

  • There have to be at least $\begin{pmatrix}2m \\ m \end{pmatrix}$ possible storage states
  • I think you can proof this with contradiction eg. show that if $\text{State}_1 = \text{State}_2$ another condition fails but I don't know how
  • My intuition tells me that if $S_1$ and $S_2$ are distinct that the storage state can't be equal but I don't know how to show that.