Generating series of binary strings that don't contain substring $1100$

2.4k Views Asked by At

I am working on a combinatorics assignment and one of the questions that is asked is

Find the generating series of binary strings of length $n$ that do not contain the substring $1100$.

I started by saying: Let $S$ be the set of all binary strings with no occurences of substring $1100$. Then $S = 0^*(1\{e, 0, 10\})^*$.

However, I'm unsure as to whether this set $S$ is the correct one. If it is correct, I can find the generating series, but I would like clarification as to whether this is the correct set.

2

There are 2 best solutions below

0
On BEST ANSWER

I think the strings can be defined out of two type of "words".

$a=(100^*)\qquad 10,100,1000,10000,...$

$b=(111^*0)\qquad 110,1110,11110,111110,...$

The criterion is to split the string where a one occurs just after a zero.

  • a one followed by a zero, can be followed by as many zeroes as we want, this is $a$
  • a one followed by at least another one should avoid the $1100$ pattern, thus it can be followed by just another single zero, this is $b$.

So since $a$ and $b$ are starting with $1$ the $1100$ substring cannot be produced.

$\color{green}{0000}\color{red}110\color{red}10\color{red}110\color{red}1000 \color{red}1000\color{red}10\color{red}10\color{red}11110\color{red}100 \color{red}1111110\color{red}1110\color{red}1000\color{red}100\color{red}111110\color{red}100\color{red}100\color{red}10\color{red}110 \color{red}100\color{red}100000\\\color{red}110\color{red}10\color{red}100000 \color{red}10000\color{red}111110\color{red}110\color{red}10\color{red}10 \color{red}110\color{red}110\color{red}100\color{red}11110\color{red}110 \color{red}100\color{red}1110\color{red}1110\color{red}110\color{red}10\color{blue}{1111111}$

Thus we have $\{a,b\}^*$

To complete the representation we also have to take care of:

  • the prefix in green, which are leading zeroes $(0^*)$
  • the suffix in blue, which are trailing ones, $(1^*)$

Note, we do not have to care about trailing zeroes, because it is already accounted into $a$, and we also do not have to care about a possibly ultimate zero because it is already accounted in $b$.

To arrive to $S=0^*\{(100^*),(111^*0)\}^*\,1^*$

I hope I have made no mistake, this is a bit complicated to deal with all the cases.


So if I have understood what a generating function for a regexp is, it should be: $g(x)=\left(\dfrac 1{1-x}\right)\left(\dfrac 1{1-\left(\left(\dfrac {x^2}{1-x}\right)+\left(\dfrac{x^3}{1-x}\right)\right)}\right)\left(\dfrac 1{1-x}\right)=\dfrac 1{(1-x)(1-x-x^2-x^3)}\\\phantom{g(x)}=\dfrac 1{1-2x+x^4}$

0
On

We have a Markov chain/NFA with four states. The starting state is $S_0$ or anything goes. If we are in $S_0$ and we read a $0$ we stay there, if we read a $1$ we go to the state $S_1$ associated with the potentially dangerous prefix $1$. In $S_1$, if we read a $1$ we go to the state $S_2$ associated with the more dangerous prefix $11$, otherwise we return to $S_0$. In $S_2$, if we read a $0$ we go to the state $S_3$ associated with the very dangerous prefix $110$, otherwise we stay in $S_2$. In $S_3$, if we read a $0$ we discard the string, otherwise we return to $S_1$.

$\hspace{1in}$enter image description here

Now there is an algorithmic way for converting a NFA (non-deterministic finite automaton) into a regular expression for the regular language accepted by such NFA. In $S_0$ we may read $0^*$ and continue staying in $S_0$. We move out from $S_0$ and go to $S_1$ when reading $0^*1$. At this point the string ends, we return to $S_0$ by reading $0$, we go to $S_2$ to stay there forever, or we go through $S_2$ to die in $S_3$ or to return to $S_1$: we may return to $S_1$ only by reading $11^*01$. The wanted language $L$ is so given by

$$ \text{[I need to elaborate it carefully]} $$ $$ 0^*\left[1\left(\{e,11^*e, 11^*0e, 11^* 0 1 \}\right)^* 0\right]^*$$

Let $S_n$ the set of such words with length $n$ and let $L_n=|S_n|$. A closed form for $L_n$ is given by the $n$-th power of the following transition matrix: $$ M=\begin{pmatrix}1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}$$ whose eigenvalues are given by $1$ and by the distinct roots $\zeta_1,\zeta_2,\zeta_3$ of the polynomial $p(z)=z^3-z^2-z-1$. By the Hamilton-Cayley theorem $M$ and $\{L_n\}_{n\geq 0}$ share the same characteristic polynomial, $q(z)=z^4-2z^3+1$, hence we have $$ L_{n+4} = 2 L_{n+3}-L_{n} $$ and the generating function for $L_n$ is of the form $$ \text{GF}(x)=\sum_{n\geq 0}L_n x^n = \frac{h(x)}{1-2x+x^4} $$ where $h(x)$ is a polynomial with degree $\leq 4$ whose coefficients can be found by interpolation.
We simply have $h(x)\equiv 1$ and the asymptotic behaviour of $L_n$ is $\approx (1.357147)\cdot(1.839287)^{n}$.