A self-convolution formula that counts bracket expressions

379 Views Asked by At

Problem: Consider an alphabet of size $m+2$, consisting of the two bracket symbols $\ [ \ ] \ $ plus $m$ non-bracket symbols ($m \ge 0$). Define $f_m(n)$ to be the number of length-$n$ strings on this alphabet, in which brackets may occur only in the usual "balanced" manner of possibly-nested parentheses. We want a formula for $f_m(n)$.

How to prove the following formula? (Is there some combinatorial derivation?)

Conjecture: If $m,n$ are nonnegative integers, then $f_m(n) = a(n+2)$, where $a()$ is the sequence defined by the following recursion:

$$\begin{array}{lcl} a(1) & = & m/2 \\ a(2) & = & 1 \\ a(i) & = & a(1)a(i-1) + a(2)a(i-2) + \cdots + a(i-1)a(1), \ \ \ i \ge 3 \end{array}$$

(The $a(i)$ should be written $a_m(i)$ to show they depend on $m$, but this is suppressed for simplicity.)

In other words, for each $m$ the infinite sequence $(f_m(0), \ f_m(1) , \ f_m(2), \ \dots)$ is generated by starting with the initial segment $(m/2, \ f_m(0))$ in an auxilliary sequence, then computing each additional element as the self-convolution of its prefix.

Example: $f_1(4) = 9$. There are 9 admissible length-4 strings on an alphabet with 1 non-bracket symbol (say $x$); namely, $\ xxxx, \ xx[], \ x[x], \ [xx], \ x[]x, \ [x]x, \ []xx, \ [[]], \ [][]$. The formula gives
$f_1(4) = a(4+2) = 9$ in the sequence that starts with ($1/2, 1$):

$$\begin{array}{lcl} 1/2 \\ 1 \\ (1/2)(1)+(1)(1/2)=1 \\ (1/2)(1)+(1)(1)+(1)(1/2)=2 \\ (1/2)(2)+(1)(1)+(1)(1)+(2)(1/2)=4 \\ (1/2)(4)+(1)(2)+(1)(1)+(2)(1)+(4)(1/2)=9 \end{array}$$

Motivation: In certain primitive computer languages (e.g., P'' and its brainfuck variations), programs are just the kind of strings we're considering, and I wanted to count the length-n programs. (For P'', there are $m = 3$ non-bracket symbols, and for brainfuck $m = 6$.) Here's a table of some $f_m(n)$ values computed by brute-force enumeration:

Some f_m(n) values computed by enumeration

m   OEIS     n = 0  1   2    3     4     5      6       7       8        9     10
--  -------  --------------------------------------------------------------------
0   A126120      1  0   1    0     2     0      5       0      14        0     42
1   A001006      1  1   2    4     9    21     51     127     323      835   2188
2   A000108      1  2   5   14    42   132    429    1430    4862    16796  58786
3   A002212      1  3  10   36   137   543   2219    9285   39587   171369
4   A005572      1  4  17   76   354  1704   8421   42508  218318
5   A182401      1  5  26  140   777  4425  25755  152675
6   A025230      1  6  37  234  1514  9996  67181  458562

Every row of this table corresponds to a sequence found at OEIS, but I've found no reference connecting all of them to a single combinatorial problem (e.g., our "bracket string" counting problem) nor generating all of them from a single parametric formula (e.g., our conjecture).

Aside: It may seem peculiar that the formula uses half-integers to generate integer sequences (for odd $m$). I think this is part of a general pattern in which an initial segment of some number of rationals may be used to parameterize a family of rational sequences generated from the initial segment by self-convolution. If the initial segment is a half-integer followed by some integers, the result will always be an integer sequence.

1

There are 1 best solutions below

1
On BEST ANSWER

Your formula is correct, but perhaps not expressed in the most enlightening way. Write it instead as $\displaystyle f_m(n) = mf_m(n-1) + \sum_{k=0}^{n-2} f_m(k)f_m(n-2-k)$. The first term corresponds to those strings which begin with a non-bracket character. Otherwise, the first character is an left bracket, followed by some number of characters (a string of this type of some length $k$) followed by a right bracket, followed by a string of this type of length $n-2-k$ for a total length of $n$. (This is the same idea as in the proof of the Catalan recurrence.)

Alternatively, one has $\displaystyle f_m(n) = \sum_{k=0}^{\lfloor \frac{n}{2}\rfloor} \binom{n}{2k}m^{n-2k}C_k$, where $C_k$ are the Catalan numbers.