How many $a$-nary sequences of length $b$ never have $c$ consecutive occurrences of a digit?

292 Views Asked by At

Let $S(a,b,c): = \#\{a$-nary sequences of length $b$ without $c$ consecutive occurrences of a digit$\}$.

For example, $S(2,n,3)$ would be the number of binary sequences of length $n$ without $3$ consecutive occurences of a $0$ or $1$. In particular, $S(2,n,3) = 2f_{n+1}$, twice the $(n+1)$st Fibonacci number.

A bit more about this example can be found below, but my question is:

How many $a$-nary sequences of length $b$ never have $c$ consecutive occurrences of a digit?

Re-phrased,

What is a concise formula for $S(a,b,c)$?

Although there are related questions that have been asked on MSE, I have not seen this particular one asked in full generality.


For $S(2,n,3)$, let $f_n$ denote the $n$th Fibonacci number and observe the following:

$n = 1$: $\{0, 1\}$. Total: $2$, i.e., $2f_{2} = 2\cdot1$.

$n = 2$: $\{00, 01, 10, 11\}.$ Total: $4$, i.e., $2f_{3} = 2\cdot2$.

$n = 3$: $\{001, 010, 011, 100, 101, 110\}.$ Total: $6$, i.e., $2f_{4} = 2\cdot3$.

$n = 4$: $\{0010, 0011, 0100, 0101, 0110, 1001, 1010, 1011, 1100, 1101\}.$ Total: $10$, i.e., $2f_{5} = 2\cdot5$.

I've written out a slightly wordy proof that the above pattern continues, but omit it here for the sake of brevity. (If it would be helpful, then I could include it in the post; but an answer to my main question would, of course, subsume this particular result.)

3

There are 3 best solutions below

3
On BEST ANSWER

For problems like this the Goulden-Jackson Cluster Method is a convenient method to derive a generating function. The coefficients of this function are the wanted numbers.

We consider the set of words in $ \mathcal{V}^{\star}$ of length $b\geq 0$ built from an alphabet $$\mathcal{V}=\{x_1,\ldots,x_a\}$$ and the set $B=\{x_1^c,\ldots,x_a^c\}$ of bad words, which are not allowed to be part of the words we are looking for. We derive a function $f(s)$ with the coefficient of $s^b$ being the number of words of length $b$ from an $a$-ary alphabet with runs of length $<c$.

According to the paper (p.7) the generating function $f(s)$ is \begin{align*} f(s)=\frac{1}{1-ds-\text{weight}(\mathcal{C})}\tag{1} \end{align*} with $d=|\mathcal{V}|=a$, the size of the alphabet and with the weight-numerator $\mathcal{C}$ with \begin{align*} \text{weight}(\mathcal{C})=\text{weight}(\mathcal{C}[x_1^c])+\cdots+\text{weight}(\mathcal{C}[x_a^c]) \end{align*}

We calculate according to the paper \begin{align*} \text{weight}(\mathcal{C}[x_1^c])&=-s^c-\text{weight}(\mathcal{C}[x_1^c])s-\cdots-\text{weight}(\mathcal{C}[x_1^c])s^{c-1}\\ &=-\frac{s^c}{1+s+\cdots+s^{c-1}}=-\frac{s^c(1-s)}{1-s^c}\\ &\cdots\\ \text{weight}(\mathcal{C}[x_a^c])&=-s^c-\text{weight}(\mathcal{C}[x_a^c])s-\cdots-\text{weight}(\mathcal{C}[x_a^c])s^{c-1}\\ &=-\frac{s^c}{1+s+\cdots+s^{c-1}}=-\frac{s^c(1-s)}{1-s^c}\\ \end{align*} and get \begin{align*} \text{weight}(\mathcal{C}[x_1^c])=\cdots=\text{weight}(\mathcal{C}[x_a^c])=-\frac{s^c(1-s)}{1-s^c} \end{align*}

It follows \begin{align*} f(s)&=\frac{1}{1-ds-\text{weight}(\mathcal{C})}\\ &=\frac{1}{1-as+a\frac{s^c(1-s)}{1-s^c}}\\ &=\frac{1-s^c}{1-as+(a-1)s^{c}} \end{align*} and we conclude \begin{align*} S(a,b,c)=[s^b]\frac{1-s^c}{1-as+(a-1)s^{c}} \end{align*}

Example: $a=2,c=3$

In the special case with a binary alphabet $a=2$ and runs of maximum length $<c=3$ we obtain the generating function $f(s)$

\begin{align*} f(s)&=\frac{1-s^3}{1-2s+s^3}\\ &=\frac{1+s+s^2}{1-s-s^2}\\ &=1+2s+4s^2+6s^3+10s^4+16s^5+\mathcal{O}(s^6) \end{align*}

4
On

For $a=2$, you get the c-bonacci numbers.

More generally, you just set up the recurrence relation and solve it for $F_b$.

$$F_n = (a-1)[F_{n-1}+ \ldots +F_{n-c}]$$

With a starting condition of $F_1 = a, F_0 = 1, F_{-i} =0$

From the theory of linear recurrences, we know that $F_n = F_n = \sum A_i \alpha_i^n$, where $\alpha_i$ are the roots of the characteristic equation. However, since it doesn't have immediately obvious roots, I would be surprised if there is a compact general answer.

2
On

In Odlyzko "Enumeration of strings" (in "Combinatorial Algorithms on Words", Apostolico and Galil (eds), Springer, 1985) one of the problems treated is a generalization of this one. The result is that the generating function for the numbers you are looking for is: $$ \sum_{b \ge 0} S(a, b, c) z^b = \frac{1 - z^c}{1 - a z + a z^{c + 1}} $$ What could be done from here is to expand the denominator as a geometric series in $a z - a z^{c + 1}$, but that won't give any kind of closed form.

The Fibonacci numbers $F_0 = 0, F_1 = 1, F_{n + 2} = F_{n+1} + F_n$ have generating function $F(z) = \frac{z}{1 - z - z^2}$. So you have: $$ \sum_{n \ge 0} n F_n z^n = z \frac{\mathrm{d} F}{\mathrm{d} z} = \frac{z (1 + z^2)}{(1 - z - z^2)^2} $$ Your values above are just a coincidence.

Added later: The terms of the sequence can be recovered from the partial fraction expansion of the generating function, each single zero of that gives a term of the form $a \rho^n$ if $1/\rho$ is a zero of the denominator, and multiple zeros give polynomials in $n$ multiplied by the $\rho^n$. Some terms might not be present for particular initial values (which get encoded in the denominator).

Note that this is the closed form of the sequence, there can't be no other (in essence, the coefficients of the Maclaurin series are unique). For particular values of $a$ and $c$ there are known expressions for the zeroes (up to $c = 3$, via the formula for the cuartic). I'm sure the relevant polynomial $1 - a z + a z^{c + 1}$ has no expression for the zeros in roots (at least for $c = 4$ the resulting quintic hasn't). So at least for that particular case (and very probably for all higher $c$ too) there is no "nice formula" (and a monster with $c + 1$ terms isn't very "nice" anyway).