Bitstrings and set of solutions


347 Views Asked by Bumbble Comm At 09 Apr 2026 - 3:36

There is something about counting bitstrings and the format of the solutions that I don't fully understand yet.

Consider a problem that asks us to count, by elementary means, how many bitstrings of length 36 contain exactly five $1$s in the first 10 positions and the substring $10100101$ in the last 20 positions. Finding the first part is pretty straightforward.

Since the first 10 positions need to contain exactly five $1$s, you count the ways to choose which five of those positions hold a $1$:

$${10 \choose 5}$$
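As a quick sanity check of this binomial count, here is a minimal Python sketch that enumerates all $2^{10}$ possible prefixes and keeps those with exactly five ones. The variable name `brute` is only illustrative.

```python
from itertools import product
from math import comb

# Enumerate all 2^10 prefixes and keep those with exactly five ones.
brute = sum(1 for bits in product("01", repeat=10) if bits.count("1") == 5)

print(brute, comb(10, 5))  # both values are 252
```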

We have no information about the next 16 positions, so that is $2^{16}$ more combinations, and thus far we have:

$${10 \choose 5} \cdot 2^{16}$$

many combinations.

Now the substring:

- there are $\{10100101\} \cdot 2^{12}$ bitstrings with no overlap; $20 - 8 + 1 = 13 \rightarrow {13 \choose 1} \cdot 2^{12}$
- there are $\{1010010100101\} \cdot 2^{7}$ bitstrings with a single overlap, and the same goes when you overlap only the starting $1$ with the $1$ at the end, so ${8 \choose 1} \cdot 2^{7}, {6 \choose 1} \cdot 2^{5}$
- there are $2\cdot \{10100101\} \cdot 2^{4}$ bitstrings in which the substring repeats itself once without overlap, so ${6 \choose 2} \cdot 2^{4}$
- there are $\{101001010010100101\} \cdot 2^{2}$ bitstrings with a double overlap, and considering all the other possibilities as well you have ${3 \choose 1} \cdot 2^{2}, {1 \choose 1} \cdot 2^{0}, {1 \choose 1} \cdot 2^{0}$
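The counts in this list can be cross-checked mechanically. The following Python sketch is one possible way to do so, not part of the original post: for each set of $k$ starting positions in the 20-bit window it checks whether the substring can sit at all of them at once, and if so adds $2^{\text{free bits}}$. The helper names `free_bits` and `term` are hypothetical.

```python
from itertools import combinations

WORD = "10100101"
N = 20  # length of the window that must contain WORD

def free_bits(starts, word=WORD, n=N):
    """Number of unconstrained bits if `word` fits at every start in
    `starts` without contradiction, otherwise None."""
    fixed = {}
    for s in starts:
        for i, ch in enumerate(word):
            # setdefault returns the bit already forced at this index, if any
            if fixed.setdefault(s + i, ch) != ch:
                return None
    return n - len(fixed)

def term(k, word=WORD, n=N):
    """Sum of 2^(free bits) over all compatible k-sets of start positions."""
    total = 0
    for starts in combinations(range(n - len(word) + 1), k):
        free = free_bits(starts, word, n)
        if free is not None:
            total += 2 ** free
    return total

terms = [term(k) for k in range(1, 5)]
print(terms)  # [53248, 1456, 14, 0]
```

Here $53248 = \binom{13}{1}2^{12}$, $1456 = 1024 + 192 + 240$ collects the three pair cases, and $14 = 12 + 1 + 1$ collects the three triple cases; no four placements fit into 20 positions.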

Since $\{10100101\} \cdot 2^{12}$ "spans" all the other combinations below, which are thus included within it, shouldn't we remove all the overlapping and repeating substrings from the universe $\{10100101\} \cdot 2^{12}$? Given the counts above, how should I then build the solution?


**Bumbble Comm** · Accepted Answer

In order to calculate the number of binary strings of length $20$ containing the substring $10100101$, pretty much all of the hard work is already done. We just have to use the inclusion-exclusion principle and put all the intermediate results together.

The number of wanted strings is \begin{align*} &\binom{13}{1}2^{12}-\left(\binom{8}{1}2^7+\binom{6}{1}2^5+\binom{6}{2}2^4\right) +\left(\binom{3}{1}2^2+\binom{1}{1}2^0+\binom{1}{1}2^0\right)\\ &\qquad=53\,248-(1024+192+240)+(12+1+1)\\ &\qquad\,\,\color{blue}{=51\,806} \end{align*}

- The first term counts the binary strings of length $20$ containing the substring $10100101$ at least once, one time for each position at which it occurs.
- The second term subtracts the strings in which the substring occurs at least twice, including overlaps.
- Since the second term subtracts too often the strings in which the substring occurs at least three times (here in the form of overlaps only), the third term adds them back for compensation.
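For a direct check that does not rely on inclusion-exclusion at all, a brute-force sketch in Python can simply test every one of the $2^{20}$ strings. This is only an illustration; the names `WORD`, `N` and `count` are not from the original post.

```python
WORD = "10100101"
N = 20

# Write every integer below 2^20 as a zero-padded 20-bit string and
# count those that contain WORD somewhere.
count = sum(WORD in format(x, f"0{N}b") for x in range(2 ** N))

print(count)  # 51806, in agreement with the inclusion-exclusion result
```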

We can check the result by applying another approach called the Goulden-Jackson Cluster Method.

We consider the set of words of length $n\geq 0$ built from an alphabet $$\mathcal{V}=\{0,1\}$$ and the set $B=\{10100101\}$ of bad words, which are not allowed to be part of the words we are looking for. We derive a generating function $f(s)$ with the coefficient of $s^n$ being the number of searched words of length $n$.

According to the paper (p.7) the generating function $f(s)$ is \begin{align*} f(s)=\frac{1}{1-ds-\text{weight}(\mathcal{C})}\tag{1} \end{align*} with $d=|\mathcal{V}|=2$, the size of the alphabet and $\mathcal{C}$ the weight-numerator of bad words with \begin{align*} \text{weight}(\mathcal{C})=\text{weight}(\mathcal{C}[10100101]) \end{align*}

We calculate according to the paper \begin{align*} \text{weight}(\mathcal{C}[10100101])&=-s^8-s^5\text{weight}(\mathcal{C}[10100101])\\ &\qquad\quad-s^7\text{weight}(\mathcal{C}[10100101])\\ \end{align*}

and get \begin{align*} \text{weight}(\mathcal{C})=-\frac{s^8}{1+s^5+s^7} \end{align*}
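The exponents $5$ and $7$ in the recursion come from the ways the word $10100101$ can overlap with a shifted copy of itself: a shift by $d$ is possible exactly when the last $8-d$ letters of the word equal its first $8-d$ letters. A small Python sketch (the function name `overlap_shifts` is hypothetical) makes this explicit:

```python
def overlap_shifts(word):
    """Shifts d (0 < d < len(word)) at which a copy of `word` can start
    inside `word` itself, i.e. a proper suffix equals a prefix."""
    n = len(word)
    return [d for d in range(1, n) if word[d:] == word[:n - d]]

shifts = overlap_shifts("10100101")
print(shifts)  # [5, 7]
```

Shift $5$ corresponds to the shared block $101$ and shift $7$ to the shared single $1$, which matches the terms $s^5$ and $s^7$ above.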

It follows

\begin{align*} f(s)&=\frac{1}{1-ds-\text{weight}(\mathcal{C})}\\ &=\frac{1}{1-2s+\frac{s^8}{1+s^5+s^7}}\\ &=\frac{1+s^5+s^7}{1-2s+s^5-2s^6+s^7-s^8}\\ \end{align*}

The coefficient $[s^n]$ in $f(s)$ gives the number of binary words of length $n$ which do not contain $10100101$. Since we want to count the number of words of length $20$ which do contain the word, we take the generating function \begin{align*} \frac{1}{1-2s}=1+2s+4s^2+8s^3+\cdots \end{align*} which counts all binary words and subtract $f(s)$ from it.

We obtain with some help of Wolfram Alpha \begin{align*} \frac{1}{1-2s}-\frac{1+s^5+s^7}{1-2s+s^5-2s^6+s^7-s^8}&=s^8+4s^9+12s^{10}+32s^{11}+\cdots\\ &\qquad+\color{blue}{51\,806s^{20}}+\cdots \end{align*} in accordance with the calculation above.
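The expansion can also be reproduced without a computer algebra system. The sketch below, assuming nothing beyond the rational function $f(s)$ derived above, computes the power series coefficients of a quotient of polynomials by the usual recurrence and subtracts them from $2^n$. The helper name `series_quotient` is illustrative.

```python
def series_quotient(num, den, n_terms):
    """Coefficients of num(s)/den(s) up to s^(n_terms - 1).
    Polynomials are coefficient lists, lowest degree first; den[0] must be 1."""
    coeffs = []
    for n in range(n_terms):
        c = num[n] if n < len(num) else 0
        for k in range(1, min(n, len(den) - 1) + 1):
            c -= den[k] * coeffs[n - k]
        coeffs.append(c)
    return coeffs

P = [1, 0, 0, 0, 0, 1, 0, 1]        # 1 + s^5 + s^7
Q = [1, -2, 0, 0, 0, 1, -2, 1, -1]  # 1 - 2s + s^5 - 2s^6 + s^7 - s^8

avoiding = series_quotient(P, Q, 21)                        # words without 10100101
containing = [2 ** n - a for n, a in enumerate(avoiding)]   # words with it

print(containing[8:12], containing[20])  # [1, 4, 12, 32] 51806
```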
