Expected Number of Bit Chunks

109 Views Asked by At

We begin with a string of zeros of length K > 1. We perform N random "flips" by choosing an index from 1 to K uniformly at random and flipping the bits from that index through index K, e.g.,

00000 -> 00111 -> 00110 -> ...

Define the number of chunks in a string as the number of continuous sequences of zeros and ones, e.g., 0011100000 has 3 chunks.

Show that the expected number of chunks as N goes to infinity is K/2.

EDIT: Thanks for suggesting that I put the work I've already done in the post. Will do from now on. So far, I've created a KxK transition matrix A in which the $A_{i,j}$ entry is the probability that a K bit string with i chunks has j chunks after another random flip. A has form:

\begin{pmatrix} \frac{1}{k} & \frac{k-1}{k} & 0 & \cdots & 0 \\ \frac{1}{k} & \frac{1}{k} & \frac{k-2}{k} & \cdots & 0 \\ 0 & \frac{2}{k} & \frac{1}{k} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{k-1}{k} & \frac{1}{k} \end{pmatrix}

Taking $A^m$ as m goes to infinity yields a matrix whose rows have weights that form a normal distribution centered at K/2. For example for K = 5 we have:

\begin{pmatrix} \frac{1}{5} & \frac{4}{5} & 0 & 0 & 0 \\ \frac{1}{5} & \frac{1}{5} & \frac{3}{5} & 0 & 0 \\ 0 & \frac{2}{5} & \frac{1}{5} & \frac{2}{5} & 0 \\ 0 & 0 & \frac{3}{5} & \frac{1}{5} & \frac{1}{5} \\ 0 & 0 & 0 & \frac{4}{5} & \frac{1}{5} \end{pmatrix}

Taking $A^{50}$ gives rows

\begin{pmatrix} 0.0625 & 0.25 & 0.375 & 0.25 & 0.0625 \end{pmatrix}

As another example with larger K, taking $K=50$ and $A^m$ for $m=500$ gives a 50 by 50 matrix whose rows each give the (seemingly normal) distribution:

K=50

The logic behind the entries in matrix A is that there is only one index that can maintain the bit string's chunk number after a flip (the first index), so the probabilities on the diagonal are all $\frac{1}{k}$. A bit string can increase chunk number by one or decrease chunk number by one on any given flip, so everything other than the diagonal and the two off-diagonals will be 0. The off diagonals are obtained by noticing that if you choose an index whose bit value differs from the index before it, flipping from that index lowers the number of chunks in the string by one. Any other index (other than the first), will increase the chunk number by one.

Why does this distribution happen? Is it really a normal distribution? How could I use this to prove the K/2 chunk theorem?