example 1.4 chapter 1 from mining of massive data sets book

Question

example 1.4 chapter 1 from mining of massive data sets book

305 Views Asked by Bumbble Comm At 10 May 2026 - 11:15

I am reading the book mining of massive data sets. (http://mmds.org/) In its chapter 1 http://infolab.stanford.edu/~ullman/mmds/ch1.pdf following section is there on page 9.

Example 1.4: Suppose hash-keys are positive integers. A common and simple hash function is to pick h(x) = x mod B, that is, the remainder when x is divided by B. That choice works ﬁne if our population of hash-keys is all positive integers. 1/Bth of the integers will be assigned to each of the buckets. However, suppose our population is the even integers, and B = 10. Then only buckets 0, 2, 4, 6, and 8 can be the value of h(x), and the hash function is distinctly nonrandom in its behavior. On the other hand, if we picked B = 11, then we would ﬁnd that 1/11th of the even integers get sent to each of the 11 buckets, so the hash function would work very well.

I could not understand this statement

hen only buckets 0, 2, 4, 6, and 8 can

be the value of h(x), and the hash function is distinctly nonrandom in its behavior.

What is the example in this trying to say? The hash in my understanding is made by h(x)= x mod B , so if B=10 then x mod 10 will have all values from 0,1,2,3,4,5,6,7,8,9.

Original Q&A

There are 1 best solutions below

**Bumbble Comm** · Accepted Answer

This is a sentence a few lines above the particular example

To be precise, if hash-keys are drawn randomly from a reasonable population of possible hash-keys, then h will send approximately equal numbers of hash-keys to each of the $B$ buckets.

This property is not satisfied, in particular, odd-numbered bins would not have any entry if $B=10$.

However, if $B$ is $11$, then we have $h(2)=2, h(4)=4, h(6)=6, h(8)=8, h(10)=10, h(12)=1$ and so on. Note that even the odd-bins would be filled uniformly. This phenomenon is due to $2$ and $11$ are coprime.

example 1.4 chapter 1 from mining of massive data sets book

There are 1 best solutions below

Related Questions in DATA-ANALYSIS

Related Questions in DATA-MINING

Trending Questions

Popular # Hahtags

Popular Questions