A variation on counting Balanced Brackets

1k Views Asked by At

While counting the number of balanced bracket expressions of length $2n$, the constraint is that for every prefix substring: $$\text{[number of occurrences of (]} - \text{[number of occurrences of )]} \geq 0.$$

I have a more general version of this problem. Count all valid expressions of length $k$ such that for every prefix substring: $$\text{[number of occurrences of (]} - \text{[number of occurrences of )]} ∈ [X, Y]$$ (and no other constraints).

For example, if $X = -3$ and $Y = 3$, then

  • )))((( is valid, ((())) is valid, ()))) is valid

  • ()))))(((( is not valid, (((()))) is not valid.

Any help would be appreciated.

1

There are 1 best solutions below

0
On

Your problem is actually a bit easier than the original problem (at least theoretically) since the generating function of the numbers you are looking for is a rational series.

Let me first give a general approach and then give an example. Let $A = \{a, \bar a\}$ and let $A^*$ be set of words on $A$, including the empty word $1$. Given a word $v$, denote by $|v|_a$ (respectively $|v|_{\bar a}$) the number of occurrences of $a$ (respectively $\bar a$) in $v$. Intutitively, $a$ will represent an open parenthesis and $\bar a$ a closing one. Let $$ L = \{\ u \in A^* \mid \text{ for each prefix $v$ of $u$, } x \leqslant |v|_{a} - |v|_{\bar a} \leqslant y\ \} $$ Now $L$ is a regular language, accepted by the following deterministic (incomplete) automaton $\mathcal{A} = (Q, A, ., 0, F)$ where

  1. $Q = \{ q \in \mathbb{Z} \mid x \leqslant q \leqslant y\ \}$
  2. $q \cdot a = q+1$ if $q < y$ and $q \cdot {\bar a} = q-1$ if $q > x$, undefined otherwise.
  3. $F = Q$

It suffices now to find the number of words of length $k$ in $L$, which can be done by standard algorithms: first compute an unambiguous regular expression for $L$ and then convert it into a one-variable generating function by taking $a = \bar a = t$. This will give you the series $S = \sum_{k \in \mathbb{N}} c_kt^k$, where $c_k = |L \cap A^k|$ is the number of words of length $k$ in $L$.

For instance, for $x = -1$ and $y = 2$, you will end up with the unambiguous regular expression $$ L = (a(a{\bar a})^* + {\bar a}a)^*(a(a{\bar a})^*(a + 1) + {\bar a} + 1) $$ leading to the generating series $$ S = \frac{1+2t-t^3}{1-3t^2+t^4} $$ Now, using Sage or you favorite mathematics software system you would get

taylor (S, t, 0, 10);

144*t^10 + 89*t^9 + 55*t^8 + 34*t^7 + 21*t^6 + 13*t^5 + 8*t^4 + 5*t^3 + 3*t^2 + 2*t + 1

For instance, there are 13 words of length 5 in $L$: $aa{\bar a}a{\bar a}$, $aa{\bar a}{\bar a}a$, $aa{\bar a}{\bar a}{\bar a}$, $a{\bar a}aa{\bar a}$, $a{\bar a}a{\bar a}a$, $a{\bar a}a{\bar a}{\bar a}$, $a{\bar a}{\bar a}aa$, $a{\bar a}{\bar a}a{\bar a}$, ${\bar a}aaa{\bar a}$, ${\bar a}aa{\bar a}a$, ${\bar a}aa{\bar a}{\bar a}$, ${\bar a}a{\bar a}aa$ and ${\bar a}a{\bar a}a{\bar a}$.