Longest sequence of 1s in binary representation of a number & Average Sequence length for number with N digits

1k Views Asked by At

I originally posted this to rec.puzzles in October 1992.

I subsequently solved the problem (at least I think I did...) and published an article about it in early 1993.

The article is nowhere to be found and my brain has aged significantly, so if any of you want to have a go at this....

Suppose you're given a word length of N bits. Looking at all the 2^N binary numbers you can represent with this, you're interested in the longest sequence of 1's in every binary number. After you got all this length values, you go on and compute the average.

Example: N = 4

000 -> longest sequence has length 0

001 -> longest sequence has length 1

010 -> longest sequence has length 1

011 -> longest sequence has length 2

100 -> longest sequence has length 1

101 -> longest sequence has length 1

110 -> longest sequence has length 2

111 -> longest sequence has length 3

======================================

SUM 11 / AVERAGE SEQUENCE LENGTH 1.375

It's obviously simple to write a program that computes this ASL ('average sequence length') value for any given N.

I'd be interested in a recurrence relation, i.e. a function

ASL(N) = f(ASL(N-1))

or

ASL(N) = f(N)

I was told that this problem is treated by a paper of 'von Neumann' and has some importance in information coding, but I couldn't find any hints (not even in the FAQ to rec.puzzles ;-)

2

There are 2 best solutions below

0
On

The following recursive framework is a bit more complicated than what you are asking for. Nevertheless, it allows a recursive computation of your ASL function, and maybe the recursion can be simplified.

For $n \geq k \geq l$, let $$A(n,k,l)$$ be the number of sequences of length $n$ such that the longest subsequence has length $k$ and there are exactly $l$ digits 1 at the end of the sequence. Then the number of sequences of length $n$ with the a longest subsequence of length $k$ is $$A(n,k) = \sum_{l = 0}^k A(n,k,l)$$ and the average longest subsequence length among sequences of length $n$ is $$\operatorname{ASL}(n) = \frac{1}{2^n} \cdot \sum_{k = 0}^n k A(n,k).$$

For the numbers $A(n,k,l)$, there is the recurrence formula \begin{align*} A(0,0,0) & = 1, \\ A(n+1,k,l) & = \begin{cases} A(n,k,l-1) + A(n,k-1,l-1) & \text{if }k = l > 0,\\ A(n,k,l-1) & \text{if } k > l > 0, \\ A(n,k) & \text{if } l = 0. \end{cases} \end{align*}

0
On

The binary numbers of $n$ digits ($n > 0$) are all the binary numbers of $n - 1$ digits with a $0$ prepended, followed by all the binary numbers of $n - 1$ digits with a $1$ prepended.

So all the maximal sequence lengths of the first half of the numbers will be unchanged. In the second half, the first thing (which used to be the all-zero sequence) will go from $0$ to $1$, but the rest of the third quarter will be unchanged, since it is all numbers that start $10\cdots$ and already had at least one $1$.

The fourth quarter is where it gets tricky. Let's split it into the seventh and eighth eighths. The seventh eighth starts with $110$, so the all the $1$s are now $2$s, but all the $2$s or higher remain the same. The eighth eighth starts $111$, so...

I think it's possible to come up with a systematic recursive algorithm this way, essentially looking at where the first $0$ digit is and using this information to calculate later values as modified earlier values – a dynamic programming approach. You store for each $N$ starting at $1$ (or $0$ if you like) a table of how many binary numbers of length $N$ have a longest subsequence of length $k$, and you can calculate the $(N+1)^\text{st}$ entry by adding the $N^\text{th}$ entry to (the $(N-1)^\text{th}$ entry, modified to bump the $k = 0$ case up to $k = 1$) plus (the $(N-2)^\text{th}$ entry, modified to bump the $k = 0$ and $k = 1$ cases up to $k = 2$), plus [...]