Counting the number of bit strings containing a substring

1.9k Views Asked by At

Consider a general bit string of length $n$; how many bit strings are there that contain a substring $T$?

For example, given a bit string of length 6, how many are there that contain $110$ as a substring, and how many that contain $101$?

2

There are 2 best solutions below

0
On BEST ANSWER

Given the comments, I think you should check the article here : https://oeis.org/A005251 in OEIS.

The mentioned recurrence formula is : a(n) = a(n-1) + a(n-2) + a(n-4) and it may be obtained by taking into account four kind of sequences:

$A_n$ : good sequences of length n that finish in 00

$B_n$ : good sequences of length n that finish in 01

$C_n$ : good sequences of length n that finish in 10

$D_n$ : good sequences of length n that finish in 11

and see what happens if we add one digit.

their sum $E_{n+1}$ would be expressed :

$$E_{n+1} = A_{n+1} + B_{n+1}+C_{n+1}+D_{n+1} = (A_n+B_n+C_n+D_n) + A_{n-1} + B_{n-1} + D_{n-1} = ....... =E_n + E_{n-1} + E_{n-3}$$

where the first four E's are 4, 7, 12, 21 for n = 2, 3, 4, 5 and they have to be obtained by hand.

0
On

Again, given the work around strings in the last days, there is a simple method to obtain a GF for the strings avoiding 110.

Suppose that we generate strings in the language $L = (1+0+110)^*$. There are two ways of obtaining the bad 110 in the process of building strings.

$\alpha = \ $ the hard way, by adding 0's and 1' to some previous string

$\beta = \ $ the short way, by adding directly a 110 to some previous string

for example, a string that contains two occurrences of 110 (there are no overlappings)

$w = ....110...110...$ may be obtained in four ways, depending on what method was used to derive each 110.

We have a multiplicity of 4 for $w$.

If we use a $-$ sign in the expression $(x+y-xxy)$ , the coefficient of the algebraic sum that describes the multiplicity of w changes from $4$ to $0$, because $1-1-1+1=0$. The four cases are $\alpha\alpha,\alpha\beta, \beta\alpha,\beta\beta $ where $\alpha\alpha$ means that the two $110's$ of $w$ were obtained the hard way.

So happens for other strings with multiplicities of 8, 16, 32...

It remains to write the GF

$1+(x+y-xxy)+(x+y-xxy)^2+... = \frac{1}{1 -x - y + xxy}$

We have obtained https://oeis.org/A000071, the Fibonacci numbers - 1 in OEIS.