Using generating functions to answer how many bit strings of length N have no 000

1.3k Views Asked by At

The Problem

I've been self-studying Introduction to Analysis of Algorithms by Sedgewick and Flajolet. I'm on the fifth chapter, and struggling with exercise 5.1:

How many bit strings of length N have no 000?

More specifically, the authors want a closed form expression counting the number of bit strings of length N which have no 000 as a substring.

My Plan of Attack

I believe the authors intend the problem to be solved in these steps:

  1. Find an "Analytic Combinatorial" expression for the problem statement
  2. Convert the aforementioned expression into a generating function
  3. Find a corresponding series for the generating function; in particular, find a closed form for the coefficients

The closed form for the coefficients will give me the answer I seek. I believe I have either:

  • found a needlessly complicated analytic combinatorial expression, or
  • do not understand generating functions enough to solve the generating function

My Work Thus Far

The Analytic Combinatoric Expression

I arrived at this form for the generating function:

$$ \mathcal{G} = \varepsilon + \mathcal{Z}_0 + \mathcal{Z}_0 \times \mathcal{Z}_0 + (\mathcal{Z}_1 + \mathcal{Z}_0 \times \mathcal{Z}_1 + \mathcal{Z}_0 \times \mathcal{Z}_0 \times \mathcal{Z}_1) \times \mathcal{G} $$

I've used Sedgewick and Flajolet's notation where $\mathcal{G}$ corresponds to the set in which I'm interested, $\mathcal{Z_x}$ corresponds to the singleton set containing the object $x$, and $\varepsilon$ is a "neutral object of size zero".

I justify this expression as follows. Strings that don't have three consecutive zeros can have three different configurations of their three leading bits:

  1. $001$
  2. $01X$
  3. $1XX$

I've used $X$ to indicate either $0$ or $1$.

Therefore, any shorter string not containing three consecutive zeros can be extended by prepending either $1$, $01$, or $001$.

I also add $0$ and $00$ since I cannot generate them using the previously mentioned scheme.

I confirmed that I can generate all length three strings except $000$ using this analytic combinatoric expression:

  1. $101 = 1 \cdot 01 \cdot \varepsilon$
  2. $100 = 1 \cdot 00$
  3. $011 = 01 \cdot 1 \cdot \varepsilon$
  4. $010 = 01 \cdot 0$
  5. $111 = 1 \cdot 1 \cdot 1 \cdot \varepsilon$
  6. $110 = 1 \cdot 1 \cdot 0$
  7. $001 = 001 \cdot \varepsilon$

The Generating Function

From the analytic combinatoric expression I derived this generating function:

$$ G(z) = \frac{1 + z + z^2}{1 - z - z^2 - z^3} $$

And this is where I've become quite stuck. AFAIK, the denominator, $1 - z - z^2 - z^3$ does not have any "simple" roots. I'd really like to factor it into a series of binomials of the form $(1 - cz)$ so that I can apply some of the ordinary generating functions machinery we studied in chapter three.

Being analytically stuck, I turned to Wolfram Alpha which helpfully points out that the full expression has two simple roots:

$$ -\sqrt[3]{-1} $$ and $$ (-1)^{2/3} $$

I don't know their multiplicity. I thought about writing the expression as:

$$ (1 - z(-1)^{-1/3})(1 - z(-1)^{-2/3}) $$

But I don't know how to apply generating function machinery to this expresion.

Wolfram Alpha also points out that the Taylor series expansion at 0 is:

$$ 1 + 2z + 4z^2 + 7z^3 + 13z^4 + O(z^5) $$

Which lead me to try directly computing the Taylor series of the generating function. I didn't see any obvious pattern in the derivatives, but perhaps I missed something. It seemed like I was headed for a huge mess of fractions.

Conclusion

I'd really like a confirmation that the generating analytic combinatoric expression I derived is correct. I'd also really appreciate some pointers about calculating the series corresponding to my generating function.

Thanks!

1

There are 1 best solutions below

3
On BEST ANSWER

First of all your approach is fine and the generating function $G(z)$ is correct. As verification we consider a slightly different approach in the same spirit as it can be found in chapter $5$ in Analysis of Algorithms. Then we look at the coefficients of $G(z)$

Generating function $G(z)$:

We can find in section Bitstrings of chapter $5$ a representation of all bitstrings as \begin{align*} B=\varepsilon+(Z_0+Z_1)\times B \end{align*} which means that each bitstring is either empty or a string starting with $0$ or $1$ followed by a bitstring. A combinatorial construction is \begin{align*} B=SEQ(Z_0+Z_1) \end{align*} meaning a bitstring is a sequence of zero or more occurrences of $0$ or $1$ respectively.

We can characterise general bitstrings also by grouping them into blocks starting with $1$. A bitstring consists of zero or more blocks starting with $1$ and followed by zero or more $0$'s. The bitstring may be preceded by zero or more $0$'s. A combinatorial representation is \begin{align*} B=SEQ(Z_0)SEQ(Z_1\,SEQ(Z_0))\tag{1} \end{align*} The generating function describing the number of bitstrings according to (1) is \begin{align*} B(z)=\frac{1}{1-z}\frac{1}{1-z\frac{1}{1-z}}=\frac{1}{1-2z}=\sum_{n\geq 0}(2z)^n \end{align*} showing us, that we have $2^n$ different bitstrings of length $n$. We can now derive from (1) a representation for all strings which do not contain the substring $000$.

We describe the bitstrings not containing a substring $000$ as a general bitstring with the restriction that each group of $0$'s has maximum length $2$. This implies that we have to exchange in (1) \begin{align*} SEQ(Z_0)=\varepsilon+Z_0+Z_0Z_0+\cdots\qquad\text{with}\qquad \varepsilon+Z_0+Z_0Z_0 \end{align*} We obtain \begin{align*} G=(\varepsilon+Z_0+Z_0Z_0)SEQ(Z_1\,(\varepsilon+Z_0+Z_0Z_0))\tag{2} \end{align*} From (2) we obtain the generating function $G(z)$ as \begin{align*} G(z)&=(1+z+z^2)\frac{1}{1-z(1+z+z^2)}\\ &=\frac{1+z+z^2}{1-z-z^2-z^3}\\ &=1+2z+4z^2+7z^4+13z^5+\cdots \end{align*} which is the same as yours.

$$ $$

Coefficients of $G(z)$:

It's convenient to use the coefficient of operator $[z^n]$ to denote the coefficient of $z^n$ in a generating series.

In this case it is not necessary to derive the coefficients $[z^n]$ in $G(z)$ by finding the roots of the equation $1-z-z^2-z^3$. We instead use a geometric series representation of $G(z)$. Since $1+z+z^2=\frac{1-z^3}{1-z}$ we obtain \begin{align*} G(z)&=\frac{1+z+z^2}{1-z-z^2-z^3}\\ &=\frac{1+z+z^2}{1-z(1+z+z^2)}\\ &=\frac{\frac{1-z^3}{1-z}}{1-z\frac{1-z^3}{1-z}}\\ &=\frac{1-z^3}{1-2z+z^4}\\ &=\frac{1-z^3}{1-(2z-z^4)}\\ &=(1-z^3)\sum_{k\geq 0}(2z-z^4)^k\\ &=(1-z^3)\sum_{k\geq 0}z^k(2-z^3)^k\\ &=(1-z^3)\sum_{k\geq 0}z^k\sum_{l=0}^k\binom{k}{l}(-x^3)^l2^{k-l}\tag{3} \end{align*}

We calculate from (3) the coefficient of $z^n$ of $G(z)$ as

\begin{align*} [z^n]G(z)&=[z^n](1-z^3)\sum_{k\geq 0}z^k\sum_{l=0}^k\binom{k}{l}(-z^3)^l2^{k-l}\\ &=\left([z^n]-[z^{n-3}]\right)\sum_{k\geq 0}z^k\sum_{l=0}^k\binom{k}{l}(-z^3)^l2^{k-l}\tag{4}\\ &=\sum_{k\geq 0}\left([z^{n-k}]-[z^{n-3-k}]\right)\sum_{l=0}^k\binom{k}{l}(-z^3)^l2^{k-l}\\ &=\sum_{{k= 0}\atop{n\equiv k(\text{mod} 3)}}^n\binom{k}{\frac{n-k}{3}}(-z^3)^\frac{n-k}{3}2^{k-\frac{n-k}{3}}\tag{5}\\ &\qquad-\sum_{{k= 0}\atop{n\equiv k(\text{mod} 3)}}^{n}\binom{k}{\frac{n-k}{3}-1}(-1)^{\frac{n-k}{3}-1}2^{k-\frac{n-k}{3}+1}\\ &=\sum_{{k= 0}\atop{n\equiv k(\text{mod} 3)}}^n\left(\binom{k}{\frac{n-k}{3}}+2\binom{k}{\frac{n-k}{3}-1}\right) (-1)^\frac{n-k}{3}2^{k-\frac{n-k}{3}}\tag{6} \end{align*}

Comment:

  • In (4) we use the linearity of the coefficient of operator and the rule $[z^{n+k}]=[z^n]x^{-k}$

  • In (5) we select the coefficient $l=n-k$ which are congruent $0$ modulo $3$ due to the third power of $z$. We also set the upper limit of the sum to $n$ since we only need coefficients up to $z^n$.

With the formula in (6) we can explicitly calculate the coefficients of $G(z)$.