2-wise vs k-wise independence for Count-Min Sketching

Question

2-wise vs k-wise independence for Count-Min Sketching

107 Views Asked by Bumbble Comm At 26 Mar 2026 - 9:12

My question is about a proof shown below (full paper)

If I expand the last line of the proof, i.e., $Pr[\forall_j X_{i,j} > e{\mathbb E}(X_{i,j}) ] < e^{-d}$, then it occurs to me that the r.v.'s $X_{i,j}$ for $j=1,...,d$ must be fully independent, or at least, d-wise independent. However, from the definition of $X_{i,j}$, it occurs to me that since it is an r.v. due to random $h_j$, and since $h_j$s are restricted to be 2-wise independent, the $X_{i,j}$s must be 2-wise independent as well. This is a contradiction. Please help me to understand, what I'm missing here.

Expansion of $Pr[\forall_j X_{i,j} > e{\mathbb E}(X_{i,j}) ] < e^{-d}$:

\begin{align} &\phantom{=}\negmedspace \negmedspace Pr[\forall_j X_{i,j} > e{\mathbb E}(X_{i,j}) ] \\ &= Pr[\cap_{j=1}^{d} \{X_{i,j} > e{\mathbb E}(X_{i,j}) \} ]\\ &=\prod_{j=1}^{d} Pr[ \{X_{i,j} > e{\mathbb E}(X_{i,j}) \} ] \quad \text{by full independence}\\ &\le \prod_{j=1}^{d} \frac{{\mathbb E}(X_{i,j})}{e{\mathbb E}(X_{i,j})} \quad \text{by the Markov inequality}\\ &=e^{-d} \end{align}

Original Q&A

There are 1 best solutions below

**Bumbble Comm** · Accepted Answer

I was with you up to this point:

However, from the definition of $X_{i,j}$, it occurs to me that since it is an r.v. due to random $h_j$, and since $h_j$s are restricted to be 2-wise independent, the $X_{i,j}$s must be 2-wise independent as well.

It turns out that the $X_{i,j}$’s here are actually fully independent of one another, even though the $h_j$ hash functions are only 2-independent.

Intuitively, the reason for this is that you can think of a count-min sketch as consisting of $d$ independent copies of a simple estimator. Each copy gets its own hash function $h_j$, which is sampled independently of all the other hash functions from a 2-independent family of hash functions. As a result, the behavior of $h_j$ for row $j$ is independent of the behavior of the hash function $h_{j’}$ for any other row $j’$.

Mathematically, what’s going on here is that 2-independence says that for any fixed $j$ and any two different values $i$ and $k$ that $h_j(i)$ and $h_j(k)$ are independent of one another and uniformly-distributed across their codomains. However, notice that this assumes you’re using the same value of $j$ here, meaning that this only applies if you’re using the same hash function both times. That’s different than saying that if you use two different hash functions, then you can only hash two elements before risking other outputs not being independent.

2-wise vs k-wise independence for Count-Min Sketching

There are 1 best solutions below

Related Questions in PROBABILITY

Related Questions in RANDOM-VARIABLES

Related Questions in INDEPENDENCE

Related Questions in HASH-FUNCTION

Trending Questions

Popular # Hahtags

Popular Questions