How many length-$n$ bitstrings containing $3$ consecutive $0$s and $4$ consecutive $1$s are there?

312 Views Asked by At

How many length-$n$ bitstrings containing $3$ consecutive $0$s and $4$ consecutive $1$s are there?


I thought that $a_n$ can be constructed in three ways:

  1. The strings that contain both three consecutive $0s$ and four consecutive $1s$ and end up with a $0$. Hence, there are $a_{n-1}$ such strings.

  2. The strings that contain both three consecutive $0s$ and four consecutive $1s$ and end up with a $1$. Hence , there are $a_{n-1}$ such strings.

  3. The strings that end up with $\dots1111000$ or $\dots0001111$ . There are $2 \times 2^{n-7} = 2^{n-6}$ such strings.

As a result ,the number of length-$n$ strings that contain both three consecutive $0s$ and four consecutive $1s$ is equal to $$a_n = 2a_{n-1}+2^{n-6}$$ with $a_0, a_1,\dots, a_6 =0$ and $a_7 = 2$. Is this correct? If not, can you write correct closed formula?

3

There are 3 best solutions below

0
On

Here's a recursive approach that might be useful.

Let $b_n$ be the number of length-$n$ bit strings that contain 3 consecutive 0s. By conditioning on the prefix 1, 01, 001, or 000, we find that $$b_n = \begin{cases} 0 &\text{if $n < 3$} \\ b_{n-1} + b_{n-2} + b_{n-3} + 2^{n-3} &\text{if $n \ge 3$} \end{cases} $$ Note that $b_n$ is https://oeis.org/A050231.

Similarly, let $c_n$ be the number of length-$n$ bit strings that contain 4 consecutive 1s. By conditioning on the prefix 0, 10, 110, 1110, or 1111, we find that $$c_n = \begin{cases} 0 &\text{if $n < 4$} \\ c_{n-1} + c_{n-2} + c_{n-3} + c_{n-4} + 2^{n-4} &\text{if $n \ge 4$} \end{cases} $$ Note that $c_n$ is https://oeis.org/A050232.

Now let $a^0_n,a^{00}_n,a^1_n,a^{11}_n,a^{111}_n$ be the number of length-$n$ bit strings that contain 3 consecutive 0s and 4 consecutive 1s and start with 0, 00, 1, 11, and 111, respectively. Then conditioning on whether the next bit is 0 or 1 yields \begin{align} a_n &= a^0_n + a^1_n \\ a^0_n &= a^{00}_n + a^1_{n-1} \\ a^{00}_n &= c_{n-3} + a^1_{n-2} \\ a^1_n &= a^0_{n-1} + a^{11}_n \\ a^{11}_n &= a^0_{n-2} + a^{111}_n \\ a^{111}_n &= a^0_{n-3} + b_{n-4} \end{align}

1
On

Well , first of all we should think about basic set theory . We know that all situations consist of $4$ subcases such that ( containing $3$ consecutive zeros and $4$ consecutive ones ) $\color{red}{\cup}$ ( containing $3$ consecutive zeros and do not contain $4$ consecutive ones ) $\color{red}{\cup}$ (do not contain $3$ consecutive zeros and contain $4$ consecutive ones ) $\color{red}{\cup}$ (do not contain $3$ consecutive zeros and do not contain $4$ consecutive ones ) $\color{blue}{=}$ All situations ,i.e , $2^n$ where $n$ is lenght of the string.

Then , lets call our situations such that

$x_1=$ The number of strings that containing $3$ consecutive zeros and $4$ consecutive ones

$x_2=$ The number of strings that containing $3$ consecutive zeros and do not contain $4$ consecutive ones

$x_3=$ The number of strins that do not contain $3$ consecutive zeros and contain $4$ consecutive ones

$x_4 =$ The number of strings that do not contain $3$ consecutive zeros and do not contain $4$ consecutive ones

We said that $x_1 + x_2 + x_3 + x_4 = 2^n$ ,and we are looking for $x_1$ . If we think these like set operations , it is clear that $x_1 = 2^n - [x_2 +x_3 -x_4]$.

Moreover , $x_2 + x_4 =$ the number of strings that do not contain $4$ consecutive $1's$ , $x_3+x_4 =$ the number of strings that do not contain $3$ consecutive $0's$.

Result , $x_2 + x_3 -x_4 =$ (the number of strings that do not contain $4$ consecutive $1's$ ) + ( the number of strings that do not contain $3$ consecutive $0's$) - (The number of strings that do not contain $3$ consecutive zeros and do not contain $4$ consecutive ones)

The rest is the process of finding generating functions for them by Goulden -jackson . I am not get in elaborately that process , you can find explanation on the stack -exchange about it.

So ,

the generating function for the number of strings that do not contain $4$ consecutive $1's$ : $$\frac{1-x^4}{1-2x+x^5}$$

the generating function for the number of strings that do not contain $3$ consecutive $0's$ : $$\frac{1-x^3}{1-2x+x^4}$$

The generating functions for the number of strings that do not contain $3$ consecutive zeros and do not contain $4$ consecutive ones : $$\frac{1+2x +3x^2 +3x^3 +2x^4 +x^5}{1-x^2 -2x^3 -2x^4 -x^5}$$

The generating function for $2^n$ is $$\frac{1}{1-2x}$$

Then , $$x_1 = \frac{1}{1-2x} - \Biggl[\frac{1-x^4}{1-2x+x^5} + \frac{1-x^3}{1-2x+x^4} - \frac{1+2x +3x^2 +3x^3 +2x^4 +x^5}{1-x^2 -2x^3 -2x^4 -x^5} \Bigg] $$

Hence , $$x_1 = 2x^7 +8x^8 + \color{blue}{26}x^9 +75x^{10}+...$$

$26x^9$ means that there are $26$ string of lenght $9$ that contain $000$ and $1111$

8
On

A convenient starting point is also the generating function of words with no consecutive equal characters at all. These words are called Smirnov or Carlitz words. See for example III.24 Smirnov words in Analytic Combinatorics by Philippe Flajolet and Robert Sedgewick for more information.

The generating function for the number of Smirnov words over a binary alphabet is given by \begin{align*} \left(1-\frac{2z}{1+z}\right)^{-1}=1+2z+2z^2+2z^3+\cdots\tag{1} \end{align*}

We use PIE the principle of inclusion-exclusion in the same way as @bulbasaur. We denote with

  • $a_n$ the number of words of length $n$ which do not contain $000$,

  • $b_n$ the number words of length $n$ which do not contain $1111$,

  • $c_n$ the number of words of length $n$ which do neither contain $000$ nor $1111$.

Since the number of all binary words of length $n$ is $2^n$, the wanted number $d_n$ is given according to PIE: \begin{align*} \color{blue}{d_n=2^n-a_n-b_n+c_n\qquad\qquad n\geq 0}\tag{2} \end{align*}

Bad word $000$:

  • We derive a generating function $A(z)=\sum_{n=0}^\infty a_nz^n$ as follows. We are looking for words which do not contain $000$. Starting from Smirnov words we can substitute each occurrence of $0$ either by $0$ or $00$ and each occurrence of $1$ by runs of $1$ of arbitrary positive length. We consequently substitute \begin{align*} &0:\quad z\quad\to\quad z+z^2\\ &1:\quad z\quad\to\quad z+z^2+z^3+\cdots = \frac{z}{1-z} \end{align*} Putting this in (1) we obtain \begin{align*} \color{blue}{A(z)}=\left(1-\frac{z+z^2}{1+z+z^2}-\frac{\frac{z}{1-z}}{1+\frac{z}{1-z}}\right)^{-1} \color{blue}{=\frac{1-z^3}{1-2z+z^4}} \end{align*}

Bad word $1111$:

  • We derive a generating function $B(z)=\sum_{n=0}^\infty b_nz^n$ as follows. We are looking for words which do not contain $1111$. Starting from Smirnov words we can substitute each occurrence of $1$ either by $1$ or $11$ or $111$ and each occurrence of $0$ by runs of $0$ of arbitrary positive length. We consequently substitute \begin{align*} &0:\quad z\quad\to\quad z+z^2+z^3+\cdots = \frac{z}{1-z}\\ &1:\quad z\quad\to\quad z+z^2+z^3 \end{align*} Putting this in (1) we obtain \begin{align*} \color{blue}{B(z)}=\left(1-\frac{\frac{z}{1-z}}{1+\frac{z}{1-z}}-\frac{z+z^2+z^3}{1+z+z^2+z^3}\right)^{-1} \color{blue}{=\frac{1-z^4}{1-2z+z^5}} \end{align*}

Bad words $000,1111$:

  • We derive a generating function $C(z)=\sum_{n=0}^\infty c_nz^n$ as follows. We are looking for words which do neither contain $000$ nor $1111$. Starting from Smirnov words we can substitute each occurrence of $0$ by $0$ or $00$ and each occurrence of $1$ either by $1$ or $11$ or $111$. We consequently substitute \begin{align*} &0:\quad z\quad\to\quad z+z^2\\ &1:\quad z\quad\to\quad z+z^2+z^3 \end{align*} Putting this in (1) we obtain \begin{align*} \color{blue}{C(z)}&=\left(1-\frac{z+z^2}{1+z+z^2}-\frac{z+z^2+z^3}{1+z+z^2+z^3}\right)^{-1}\\ &\,\,\color{blue}{=\frac{1+2z+3z^2+3z^3+2z^4+z^5}{1-z^2-2z^3-2z^4-z^5}} \end{align*}

A generating function for the number $2^n$ of all binary words of length $n$ is \begin{align*} \frac{1}{1-2z}=1+2z+4z^2+8z^3+\cdots \end{align*}

Putting all together: Finally we obtain the wanted generating function $D(z)$ according to (2) as \begin{align*} \color{blue}{D(z)}&=\frac{1}{1-2z}-A(z)-B(z)+C(z)\\ &=\frac{1}{1-2z}-\frac{1-z^3}{1-2z+z^4}-\frac{1-z^4}{1-2z+z^5}\\ &\qquad\qquad+\frac{1+2z+3z^2+3z^3+2z^4+z^5}{1-z^2-2z^3-2z^4-z^5}\\ &=2x^7+8x^8+\color{blue}{26}x^9+75x^{10}+200x^{11}+\cdots \end{align*} in accordance with the result of @bulbasaur.

Notes:

  • The blue marked $26$ valid words of length $9$ are \begin{align*} \begin{array}{cccc} 000001111&\quad000011110&\quad000011111&\quad000101111\\ 000111100&\quad000111101&\quad000111110&\quad000111111\\ 001111000&\quad010001111&\quad011110000&\quad011110001\\ 011111000&\quad100001111&\quad100011110&\quad100011111\\ 101111000&\quad110001111&\quad111100000&\quad111100001\\ 111100010&\quad111100011&\quad111101000&\quad111110000\\ 111110001&\quad111111000 \end{array} \end{align*}

  • Here the problem is not that hard, since we do not have to cope with overlapping words. The bad words $000$ and $1111$ can be treated separately. In case of overlaps of bad words the cluster method of Goulden and Jackson can show its strength.