Closed form or formula for the $n$-th binary number with m-bits that has at most k ones

85 Views Asked by At

Like the title suggests, I need the $n$-th number, not the number of numbers that answer that criterion. For example, for $m=4$ and $k=2$ the formula should equal $0,1,2,3,4,5,6,8,9,10,12$ as $n$ increases.

I've tried to find a pattern by sketching the binary numbers, and there seems to be one visually but I could not come up with the formula.

My motivation: I'm trying write a program that iterates over all binary numbers with $n$-bits that has at most $n/2$ ones in it and apply function foo to it.

Given that we're dealing with exponential amount of numbers I want to split the work between multiple processors, and I want each processor to generate its own batch of said numbers.

With such formula I'll be able to give each processor a start index and an end index to maximize efficiency.

1

There are 1 best solutions below

0
On BEST ANSWER

A necessary first step is to know $N(m,k)$: the total number of elements of this finite sequence - call it the $(m,k)$-sequence. This is given by $$N(m,k) = \binom m0 + \binom m1 + \dots + \binom mk,$$ but unfortunately this sum has no closed form. If we're going to be doing many computations with this sequence, it is probably necessary to build up a table of $N(m',k')$ for all $m' \le m$ and $k' \le k$.

With that in mind, let $f_{m,k}(n)$ be the $n^{\text{th}}$ element of the $(m,k)$-sequence. The rule for finding $f_{m,k}(n)$ is as follows:

  1. Of the $N(m,k)$ elements of the sequence, $N(m-1,k)$ begin with a $0$. Thus, if $n \le N(m-1,k)$, then the $n^{\text{th}}$ element begins with a $0$, and we can look it up in the $(m-1,k)$-sequence instead: $f_{m,k}(n) = f_{m-1,k}(n)$.
  2. Otherwise, we skip past the first $N(m-1,k)$ elements, and want the $(n - N(m-1,k))^{\text{th}}$ element that begins with a $1$. The elements of the $(m,k)$-sequence that begin with a $1$ can all be obtained from elements of the $(m-1,k-1)$-sequence by adding a $1$ in the $m^{\text{th}}$ bit. So we take $f_{m-1,k-1}(n - N(m-1,k))$, and add $2^{m-1}$ to it.

After adding a base case, we get the recurrence relation $$ f_{m,k}(n) = \begin{cases} 0 & \text{if }n=1, \\ f_{m-1,k}(n) & \text{if }n \le N(m-1,k), \\ f_{m-1,k-1}(n - N(m-1,k)) + 2^{m-1} & \text{if } N(m-1,k) < n \le N(m,k). \end{cases} $$ The above recurrence assumes $f_{m,k}$ is $1$-indexed. For a $0$-indexed version, take the following, which simply adjusts all the cases by $1$: $$ f_{m,k}(n) = \begin{cases} 0 & \text{if }n=0, \\ f_{m-1,k}(n) & \text{if }n < N(m-1,k), \\ f_{m-1,k-1}(n - N(m-1,k)) + 2^{m-1} & \text{if } N(m-1,k) \le n < N(m,k). \end{cases} $$


As a note on efficiency, if we are dividing labor between processors as in the question, and we want to do an operation on the numbers at indices $a,a+1, \dots, b$ of the $(m,k)$-sequence, it makes sense to use the recurrence above to compute the starting number $f_{m,k}(a)$, but we shouldn't keep using it for the next terms.

Given an element $x$ of the $(m,k)$-sequence, the rule for computing the next element is simply:

  • If $x$ has fewer than $k$ bits set, the next element is $x+1$.
  • If $x$ has exactly $k$ bits set, and there are $z$ trailing $0$ bits at the end of $x$, the next element is $x + 2^z$.