Generating a random combination in O(k)?

1.1k Views Asked by At

I need to generate a "fair" random combination of $k$ items chosen from $n$ choices. All the algorithms I've been able to find so far (reservoir sampling, Fisher-Yates shuffle, ...) are of $O(n)$ complexity or worse.

Is there an algorithm of complexity $O(k)$, also?

Unfortunately I don't have a list of my $n$ choices that I can operate on either. I could build one, but that would be an $O(n)$ operation in itself. So, I am looking for an algorithm that just spits out indices into such a list, but doesn't require it.

A quick update:

Here is what I'm currently thinking about: Let's say we're looking at $5$ choose $3$. Then, the possible combinations are:

1 2 3
1 2 4
1 2 5
1 3 4
1 3 5
1 4 5
2 3 4
2 3 5
2 4 5
3 4 5

So, when choosing the first number, I need to pick $1$, $2$ or $3$, with a probability of $\frac{6}{10}$, $\frac{3}{10}$, and $\frac{1}{10}$ respectively. For the second place I then need to choose from either $2$, $3$, and $4$ with probabilities $\frac{3}{6}$, $\frac{2}{6}$, and $\frac{1}{6}$ respectively, or from $3$ and $4$ with probabilities $\frac{2}{3}$ and $\frac{1}{3}$ or I can only to choose 4. And so forth...

So, basically, I "only" need two $O(1)$ functions:

  • One that tells me, given the numbers chosen so far, what's my denominator in the probability fractions (i.e. what's the range of random number I need to pick from). In the examples above, this function would return $10$ for the first number. For the second number, it would return $6$, $3$, or $1$, depending on whether I chose $1$, $2$, or $3$ respectively for the first number, and so on...
  • A function that tells me the number that corresponds to the random "numerator" that I picked. In the example above, the function for the first digit would return $1$ if my random number was between $0$ and $5$ inclusive, $2$ if it was between $6$ and $8$, and $3$ if it was $9$.

I believe that since the numbers are in ascending order, these functions should probably only depend on $n$ minus the last chosen number and are independent of which other numbers were previously chosen. So, at a first rough glance, I don't see any introduction of $O(k)$ or $O(n)$ complexity into these functions, at least not from the parameters they depend on...

Do such functions with $O(1)$ complexity exist?

1

There are 1 best solutions below

2
On BEST ANSWER

wlog, assume that $k \le n/2$. Indeed, if $k > n/2$, then you can instead generate a random combination of $n-k$ items from $n$ items, and inverse it.

Start with an empty set. Repeatedly generate a random number between $1$ and $n$. If this number is not in the set, add it to the set. Repeat until the set contains $k$ elements.

What is the complexity of this algorithm? Since $k \le n/2$, every time you generate a random number, with probability at least half, it will not be in the set. Thus, the expected number of times you generate a number until you insert a new member to the set is $2 = O(1)$. Thus, the expected number of steps is $O(k)$.

Note: "how do you check whether the number is in the set in $O(1)$?": use an external array initially all zeroes. Whenever you add a number, set the corresponding element in the array into 1. After the set contains $k$ elements, reset all those elements of the array back to $0$. This requires using an external array of size O(n), but the same array will be used for any call to the random-generating function, and you still only make $O(k)$ modifications to it.

Alternatively if you don't like array this still works in $O(k \log k)$, using balanced binary tree for the set.