Number of words where each letter is adjacent to the same letter.

265 Views Asked by At

Let $a_n$ (for $n \geq 2$) denote the number of words of length $n$ over a 7-letter alphabet in which each letter is the same as the previous or next one. Find a compact formula for $a_n$.

The conclusion is that each letter must appear in a "group" of the same letters of size 2 or more.

My first approach was by recursion and I got:

$$a_n = 7a_{n-2} + 7a_{n-3}, a_0 = 1, a_1 = 0, a_2 = 7$$

since every string of length $n$ can be created from a string of length $n-2$ by adding two identical letters, or from a string of length $n-3$ by adding three identical letters. We can achieve a group of any length in this way. Each time, we choose the color of the added letters in 7 ways.

The second way is to use stars and bars, dividing into elements $\geq 2$. At the start, we can assign 2 elements to each of the $k$ groups, so we have the standard stars and bars but for $n-2k$ elements:

$$\binom{n-2k+k-1}{k-1} = \binom{n-k-1}{k-1}.$$

Then we sum over all possible numbers of groups and assign a color to each group in 7 ways:

$$a_n = \sum_{k=1}^{\left\lfloor\frac{n}{2}\right\rfloor}\binom{n-k-1}{k-1} \cdot 7^k.$$

There are at most $\left\lfloor\frac{n}{2}\right\rfloor$ groups.

However, first of all, these two solutions give different values, and secondly, I suspect that they count some things more than once, because numbers they give are slightly larger than when manually counting the possibilities.

Is any of these formulas good? If not, what could be the way to find the number of these sequences? Thanks.

4

There are 4 best solutions below

1
On BEST ANSWER

The recurrence $a_n=7a_{n-2}+7a_{n-3}$ is incorrect, because it overcounts. For example, $\mathsf{AAAA}$ can be obtained in two ways by your method; either by starting with $\mathsf{AA}$ and appending $\mathsf{AA}$, or starting with $\mathsf{A}$ and appending $\mathsf{AAA}$.

A correct recurrence (which I got from Henry's comment) is $$ a_n=a_{n-1}+6a_{n-2}. \tag{$n\ge 4$}, $$ with the base cases $a_2=a_3=7$. This is because any sequence of length $n$ can be obtained in exactly one of two ways.

  • You can start with a valid sequence of length $n-1$, and append a copy of the last character. These sequences will all end with a group of $3$ or more, and every sequence ending in a group of $3$ or more is obtained this way.

  • You can start with a valid sequence of length $n-2$, whose last letter is $x$, and append a pair of identical letters which are unequal to $x$. These sequence end with a group of $2$, and all such sequences can be obtained this way.

For the stars-and-bars method, you are also overcounting. You are correct that the number of ways to partition the spots into $k$ contiguous groups is $\binom{n-k-1}{k-1}$. However, the number of ways to assign numbers to these groups is not $7^k$, but is instead $7\cdot 6^{k-1}$. This is because adjacent groups must receive different numbers (else they would be in the same group). The leftmost group has $7$ choices, but then each subsequent group has only $6$ choices, because each group cannot be assigned the same number as the group on its left. Therefore, the correct expression is $$ 7\sum_{k=1}^{\lfloor n/2\rfloor } \binom{n-k-1}{k-1}6^{k-1} $$

This summation agrees with the recurrence $a_n=a_{n-1}+6a_{n-2}$.

0
On

One argument is to condition on where the first change in letters takes place: assume the letters are the digits $1, 2, \ldots, 7$ and let $b_n$ be the number of ways when starting with 1.

Then (for $n\ge4$)$$b_n= 1 \mbox{ [all letters are 1s] } + 6 \times b_2 \mbox{ [all but the last two letters are 1s] }$$ $$+\; 6\times b_3 \mbox{ [all but the last three are 1s] }+ \ldots . $$ But the first change cannot take place after the first letter, so the last term corresponds to all but the last $n-2$ letters being $1$s, and so , $$b_n= 1 + 6(b_2 + b_3 + \ldots b_{n-2}).$$ But then $$b_{n-1} = 1 + 6(b_2 + b_3 + \ldots b_{n-3})$$ and $$b_n - b_{n-1} = 6b_{n-2},$$ provided $n\ge 5$.

This is a standard recurrence and has the solution $$b_n=\frac{1}{5}\big(3^{n-1}-(-2)^{n-1}\big),$$ for $n\ge 2$, in fact.

0
On

starting with $a_n=a_{n-1}+6a_{n-2}$ with $a_2=a_3=7$, I get... $$a_n = C_1(3)^n+C_2(-2)^n$$ which gives rise to the system... $$ 9C_1+4C_2 = 7$$ $$ 27C_1-8C_2=7$$

The solution is $C_1 = \frac 7{15}, C_2=\frac 7{10}$

So the simplified solution works out to... $$a_n = \frac 75 \bigg [ 3^{n-1}-(-2)^{n-1} \bigg ]$$

Which agrees with the answer given by @Mike Earnest but is arguably more compact.

0
On

This contribution is more of a supplement, the approach of which could also be useful for similar problems. We consider a $7$-letter alphabet. Words which do not have any consecutive equal letters are called Smirnov words. A generating function for Smirnov words of a $7$-letter alphabet is given as \begin{align*} \left(1-7\frac{z}{1+z}\right)^{-1}\tag{1} \end{align*} The coefficient $[z^n]$ of $z^n$ in the series (1) gives the number of $7$-ary words of length $n$ which do not have any consecutive letters.

In the current example we are looking for words which do have at least two consecutive letters. We therefore substitute occurrences of $z$ by \begin{align*} z\quad&\to\quad z^2+z^3+z^4+\cdots=\frac{z^2}{1+z}\tag{2}\\ \end{align*} We derive (with some help of Wolfram Alpha) from (1) and (2) the generating function \begin{align*} \color{blue}{A(z)}&=\left(1-\frac{7\frac{z^2}{1-z}}{1+\frac{z^2}{1-z}}\right)^{-1}\\ &=\frac{1-z+z^2}{1-z-6z^2}\\ &\,\,\color{blue}{=\frac{7}{15}\,\frac{1}{1-3z}+\frac{7}{10}\,\frac{1}{1+2z}-\frac{1}{6}}\\ &\,\,\color{blue}{=1+7z^2+7z^3+49z^4+91z^5+385z^6+\cdots} \end{align*} in accordance with the results from other answers.

Note: Smirnow words can be found for instance in example III.24 in Analytic Combinatorics by P. Flajolet and R. Sedgewick.