Dividing the interval in subintervals of equal length

3.2k Views Asked by At

I am asked to describe the operation of the processure BUCKET SORT at the array $$A=\langle 0.75, 0.13, 0.16, 0.64, 0.39, 0.20, 0.89, 0.53, 0.71, 0.42, 0.19 \rangle $$

dividing the interval $[0, 1)$ in $10$ subintervals of equal length.

In my book I found the following algorithm:

BUCKETSORT(A) 
 n ← length[A] 
 for i ← 1 to n 
     insert A[i] in the chain B[⌊nA[i]⌋] 
 for i ← o to n-1 
     sort the chain B[i] with insertion sort 
 concatenation of the chains B[0], B[1], ... , B[n-1] 

So, in this case $n=11$, right??

How can we use the fact that we have to divide the interval in $10$ subintervals of equal length??

2

There are 2 best solutions below

4
On

Your ten subintervals are of the form $[(k-1)/10,k/10)$ where $k = 1, 2, \ldots, 10$. It sounds like the question is asking you to determine which points go into which buckets, and then perhaps the overall complexity of the algorithm for this data assuming that you are using insertion sort on each bucket.

2
On

This line of the book's algorithm,

 n ← length[A]

is used in two ways, of which one is not essential to a bucket sort algorithm. It is used in for i ← 1 to n, which determines which elements A[i] are sorted—for this purpose, the number must be length[A]—but it is also used in the expression B[⌊nA[i]⌋], which says that there are n buckets.

The discussion becomes a bit easier if we rewrite the first loop, for i ← 1 to n, as for i ← 1 to length[A], while still using n everywhere else. Then n is just the number of buckets.

"Bucket sort" merely means you choose some number of buckets, which may or may not be a function of the size of the input. Using n for the number of buckets only, you could have n ← length[A] or n ← length[A]/2 or n ← sqrt(length[A]) or even n ← 5.

The choice n ← length[A] satisfies the property that the mean number of elements in each bucket will be $1$. I'm not convinced that that's a particularly good choice in general. In this particular exercise, however, it seems that you are directed to set $n=10$, which means you don't have to decide what the best value of $n$ would be.