Use Bernoulli trials to select random k-length permutations

90 Views Asked by At

Using only Bernoulli trials with $p=0.5$, how can I efficiently create a permutation of length $k$ from a set of length $n$?

I can think of a few ways to do this that waste trials, but I’d like to do so efficiently (using every trial). For example, the best method I can think of is (using the Fisher-Yates shuffle):

If $n=5$

For $i$ in $[0,k-1]$:

  1. Conduct $ceil(log_2(n))$ trials.

$1,1,0$

  1. Convert the series of trials into a binary number, $q$.

$q=b110=6$

  1. Perform the Fisher-Yates step to find the next term in the permutation using the swap index: $q \mod (n-i)$.

Intuitively, this feels wasteful by at most 2 times the number of trials required. Thanks in advance!

2

There are 2 best solutions below

5
On BEST ANSWER

You can generate efficiently (with negligible loss of entropy, asymptotically) samples from a $\ell$-sided die using fair coin tosses: See [1], [2]. These samples can then be used in the Fisher-Yates shuffle.

To describe a simple procedure, pick a large $m$ and find $n$ such that $2^{n-1}< \ell^m \le 2^n$. Pick $n$ bits, yielding a uniform $X$ in $\{0,1,\ldots, 2^n-1\}$ and reject $X$ if $X \ge \ell^m$; otherwise, $X$ yields a sample of $m$ independent uniform variates in $\{0,1,\ldots, j-1\}$; only one bit of entropy is lost for each such sample.

[1] Knuth, Donald E., and Andrew C. Yao. "The complexity of nonuniform random number generation. Algorithms and complexity (Proc. Sympos., Carnegie-Mellon Univ., Pittsburgh, Pa., 1976), 357–428." (1976): 357-428.

0
On

You can do a procedure akin to arithmetic coding. Let's say $n=7$ and $k=3$. Then we take the segment $[0,1]$, divide into in $7$ equals sub-intervals (each one correspond to the first chosen object). Then each one is again divided into $6$ elements, and again into $5$. We have then a division of the unit interval into $7\times 6 \times 5$ small intervals, regarded as the leaf of a tree.

Then we start doing binary trials, which we imagine as the succesive fractional binary digits of a real number in $[0,1]$. For example, if the first two trials give $(1,0)$, we write that as $0.10*****$ , the binary expansion of a real number (the asterisks are still considered as unknown). Then, after two trials we are inside an interval $[2/4, 3/4]$. We continue until our "trial" interval falls inside one of the intervals of the tree, which correspond to the chosen permutation.

It can be shown that this procedure wastes no more than one bit (i.e., one trial) from the ideal, which would be $\log_2(7 \cdot 6 \cdot5) = 7.714$

Here's a Python code

import random

def precomputeTree(n,k):
    tree = [ (0,1,[])]
    for kk in range(n,n-k,-1):
        tree = divideTree(tree, kk)
    return tree

# each element has (r1, r2, [path])
def divideTree(tree,parts):
    t2 = []
    for x in tree:
        r1 = x[0]
        dt = (x[1]-r1)/parts
        path = x[2]
        for j in range(parts):
            r2 = r1+dt
            t2.append((r1, r1+dt, [*path,j]))
            r1 = r2
    return t2

def genShuffle(tree):
    r1=0
    r2=1
    i1=0
    i2=len(tree)-1
    dt = 1
    cont = 0
    while i1 != i2:
        randb = random.randint(0,1)
        dt /= 2
        cont +=1
        # print(f"{i1=} {i2=} {r1=} {r2=} {dt=} {randb=}")
        if randb == 1:
            r1 += dt
            while tree[i1][1] < r1:
                i1 += 1
        else:
            r2 -= dt
            while tree[i2][0] >= r2:
                i2 -= 1
    return tree[i1][2], cont

tree = precomputeTree(7,5)
#print(tree)
for t in range(10): # some trials
    print(genShuffle(tree))