Generalization for Catalan number

382 Views Asked by At

I know that the number of ways to open and close $n$ parenthesis is the $n$-th Catalan number. What about the number of sequences of open and closed parenthesis of length $k$ that contains exactly $n$ well open and closed parenthesis?

Example: ))())() is a sequence of length 7 that contains exactly 2 well open and closed parenthesis.

Basic fact:

The number of sequences of open and closed parenthesis of length $k$ is equal to $2^k;$

The number of sequences of open and closed parenthesis of length $k$ that contains exactly $n$ well open and closed parenthesis is obviously greater that the the $n$-th Catalan number, but which is the precise value?

1

There are 1 best solutions below

2
On BEST ANSWER

We start by computing the relevant generating functions. Taking a binary string or string of parentheses we extend all well-balanced subsequences to be maximal. This leaves additional filling which has the property that it may not contain an open parentheses followed by a closing one. We parameterize the filling by the variable $q$ indicating its length, so that there are $q+1$ admissible fillings of length $q$ with $q+1$ places for insertion of a well-balanced non-empty word. We furthermore have the Catalan number generating function

$$C(z) = \frac{1-\sqrt{1-4z}}{2z}.$$

Inserting into the filling we find the mixed OGF of binary words with pairs of well-balanced parentheses marked which is (variable $z$ for word length and $u$ for pairs of parentheses)

$$\sum_{q\ge 0} (q+1) z^q \sum_{p=0}^{q+1} {q+1\choose p} (C(uz^2)-1)^p \\ = \sum_{q\ge 0} (q+1) z^q C(uz^2)^{q+1} = \frac{C(uz^2)}{(1-zC(uz^2))^2}.$$

In this interpretation of the problem we have that an open parenthesis paired with a closed one with a sequence between them that is not well-balanced do not count. The binomial coefficient indicates the choice of slots in the filling where a non-empty well-balanced word is placed. This step may be skipped if we admit the empty well-balanced word in which case we proceed to the next line.

Now to extract coefficients from this we recall the functional equation of $C(z)$ which is

$$C(z) = 1 + z C(z)^2 \quad\text{or}\quad z = \frac{C(z)-1}{C(z)^2}.$$

We do the coefficient on $[u^k]$ first ($k$ pairs of parentheses) and view $C(uz^2)$ as a function of $u$ which yields

$$u = \frac{1}{z^2} \frac{C(uz^2)-1}{C(uz^2)^2}.$$

The coefficient extractor here is

$$[u^k] \frac{C(uz^2)}{(1-zC(uz^2))^2} = \frac{1}{2\pi i} \int_{|u|=\epsilon} \frac{1}{u^{k+1}} \frac{C(uz^2)}{(1-zC(uz^2))^2} \; du.$$

We let $w=C(uz^2)$ so that $u=(w-1)/w^2/z^2$ and

$$du = \frac{1}{z^2} \left(\frac{1}{w^2} - 2\frac{w-1}{w^3}\right) \; dw = \frac{1}{z^2} \frac{2-w}{w^3} \; dw$$

which yields for the integral

$$\frac{1}{2\pi i} \int_{|w-1|=\gamma} z^{2k+2} \frac{w^{2k+2}}{(w-1)^{k+1}} \frac{w}{(1-zw)^2} \frac{1}{z^2} \frac{2-w}{w^3} \; dw \\ = \frac{z^{2k}}{2\pi i} \int_{|w-1|=\gamma} \frac{w^{2k}}{(w-1)^{k+1}} \frac{1}{(1-zw)^2} (2-w) \; dw.$$

We get two pieces here which are

$$\frac{z^{2k}}{2\pi i} \int_{|w-1|=\gamma} \frac{w^{2k}}{(w-1)^{k+1}} \frac{1}{(1-zw)^2} \; dw$$

and

$$-\frac{z^{2k}}{2\pi i} \int_{|w-1|=\gamma} \frac{w^{2k}}{(w-1)^{k}} \frac{1}{(1-zw)^2} \; dw.$$

We obviously require where $p\le k$

$$\frac{1}{p!} \left(w^{2k} \frac{1}{(1-zw)^2}\right)^{(p)}$$

This is by Leibniz

$$\frac{1}{p!} \sum_{q=0}^p {p\choose q} \frac{(2k)!}{(2k-q)!} w^{2k-q} \frac{z^{p-q} (p-q+1)!}{(1-zw)^{p-q+2}} \\ = \sum_{q=0}^p {2k\choose q} w^{2k-q} \frac{z^{p-q} (p-q+1)}{(1-zw)^{p-q+2}}.$$

Evaluate at $w=1$ to get

$$\sum_{q=0}^p {2k\choose q} \frac{z^{p-q} (p-q+1)}{(1-z)^{p-q+2}}.$$

The contribution from the two integrals now becomes

$${2k\choose k} \frac{z^{2k}}{(1-z)^2} + z^{2k} \sum_{q=0}^{k-1} {2k\choose q} \left(\frac{z^{k-q} (k-q+1)}{(1-z)^{k-q+2}} - \frac{z^{k-1-q} (k-q)}{(1-z)^{k-q+1}}\right) \\ = {2k\choose k} \frac{z^{2k}}{(1-z)^2} \\ + z^{2k} \sum_{q=0}^{k-1} {2k\choose q} \left(\frac{z^{k-q} (k-q+1) - z^{k-1-q} (k-q) + z^{k-q} (k-q) }{(1-z)^{k-q+2}}\right) \\ = {2k\choose k} \frac{z^{2k}}{(1-z)^2} \\ + z^{2k} \sum_{q=0}^{k-1} {2k\choose q} \left(\frac{z^{k-q} (2k-2q+1) - z^{k-1-q} (k-q) } {(1-z)^{k-q+2}}\right) \\ = z^{2k} \sum_{q=0}^{k} {2k\choose q} \frac{z^{k-q} (2k-2q+1) - z^{k-1-q} (k-q) } {(1-z)^{k-q+2}}.$$

Doing the coefficient extraction on $[z^n]$ (word contains $n$ parentheses in total) we get two pieces, namely

$$\sum_{q=0}^{k} {2k\choose q} [z^{n-3k+q}] \frac{2k-2q+1} {(1-z)^{k-q+2}} \quad\text{and}\quad -\sum_{q=0}^{k} {2k\choose q} [z^{n-3k+q+1}] \frac{k-q} {(1-z)^{k-q+2}}$$

which produce

$$\sum_{q=0}^{k} {2k\choose q} (2k-2q+1) {n-2k+1\choose k-q+1} \quad\text{and}\quad -\sum_{q=0}^{k} {2k\choose q} (k-q) {n-2k+2 \choose k-q+1}.$$

In order to join these we put

$${n-2k+1 \choose k-q+1} = \frac{n-3k+q+1} {n-2k+2} {n-2k+2 \choose k-q+1}$$

to get

$$\sum_{q=0}^{k} {2k\choose q} {n-2k+2\choose k-q+1} \frac{(k-q+1)(n+2q+1-4k)}{n-2k+2}$$

and arrive at the closed form where clearly $n\ge 2k:$

$$\bbox[5px,border:2px solid #00A000]{ \sum_{q=0}^{k} {2k\choose q} {n-2k+1\choose k-q} (n+2q+1-4k).}$$

As a sanity check when $n=2k$ we should get ordinary Catalan numbers. We obtain

$$\sum_{q=0}^{k} {2k\choose q} {1\choose k-q} (2q+1-2k) = {2k\choose k} - {2k\choose k-1} = \frac{1}{k+1} {2k\choose k}$$

and the check goes through. We verified all of the above in Maple including an enumeration routine, which is suprisingly effective even on strings of, say, $16$ parentheses. Here is the generating function for pairs of parenthesis among $8$ parentheses:

$$14\,{u}^{4}+84\,{u}^{3}+100\,{u}^{2}+49\,u+9$$

and here it is for $10$ parentheses:

$$42\,{u}^{5}+270\,{u}^{4}+375\,{u}^{3}+245\,{u}^{2}+81\,u+11.$$

For $11$ parentheses we get

$$264\,{u}^{5}+660\,{u}^{4}+660\,{u}^{3}+352\,{u}^{2}+100\,u+12.$$

The leading term is not a Catalan number when $n$ is odd. This was the Maple code:

with(combinat);

CAT := n -> 1/(n+1)*binomial(2*n,n);

CAT_PARS :=
proc(n)
option remember;
local recurse, res;

    res := [];

    recurse :=
    proc(sofar, oc, cc)
        if oc = 0 and cc = 0 then
            res := [op(res), sofar];
            return;
        fi;

        if oc > 0 then
            recurse([op(sofar), `(`], oc-1, cc+1);
        fi;
        if cc > 0 then
            recurse([op(sofar), `)`], oc, cc-1);
        fi;
    end;

    recurse([], n, 0);
    convert(res, `set`);
end;

BALCOUNT :=
proc(parlist)
local posA, posB, intervA, intervB, len, subseq, count,
    res, n;

    subseq := [];

    n := nops(parlist);

    for posA to n do
        for posB from posA+1 to n do
            intervA := parlist[posA..posB];
            len := posB - posA + 1;

            if type(len, even) and
            intervA in CAT_PARS(len/2) then
                subseq := [op(subseq), [posA, posB]];
            fi;
        od;
    od;

    res := [];

    subseq :=
    sort(subseq,
         (a, b) -> a[2]-a[1] < b[2]-b[1]);
    count := nops(subseq);

    for posA to count do
        intervA := subseq[posA];
        for posB from posA+1 to count do
            intervB := subseq[posB];

            if intervB[1] <= intervA[1] and
            intervB[2] >= intervA[2] then
                break;
            fi;
        od;

        if posB = count + 1 then
            res := [op(res), intervA];
        fi;
    od;

    add((iv[2]-iv[1]+1)/2, iv in res);
end;

GFP :=
proc(n)
option remember;
local gf, idx, d, parlist;

    gf := 0;

    for idx from 2^n to 2*2^n-1 do
        d := convert(idx, base, 2);

        parlist :=
        subs([0=`(`, 1=`)`], d[1..n]);

        gf := gf + u^BALCOUNT(parlist);
    od;

    gf;
end;

GFA :=
proc(n)
local C;

    C := (1-sqrt(1-4*u*z^2))/(2*u*z^2);

    coeftayl(C/(1-z*C)^2, z=0, n);
end;

GFB :=
n -> add(u^k*add(binomial(2*k,q)*binomial(n-2*k+1, k-q)
         *(n+2*q+1-4*k), q=0..k), k=0..floor(n/2));

GFC :=
n -> add(u^k*(n+1-2*k)^2/(n+1)*binomial(n+1,k),
         k=0..floor(n/2));

Additional simplification. Splitting the sum into two parts we get for the first part

$$(n+1-4k) \sum_{q=0}^{k} {2k\choose q} {n-2k+1\choose k-q} \\ = (n+1-4k) \sum_{q=0}^{k} {2k\choose q} [z^{k-q}] (1+z)^{n-2k+1} \\ = (n+1-4k) [z^k] (1+z)^{n-2k+1} \sum_{q=0}^{k} {2k\choose q} z^q.$$

Now we may extend $q$ beyond $k$ because the sum term does not contribute to the coefficient extractor in that case, getting

$$(n+1-4k) [z^k] (1+z)^{n-2k+1} \sum_{q\ge 0} {2k\choose q} z^q \\ = (n+1-4k) [z^k] (1+z)^{n-2k+1} (1+z)^{2k} = (n+1-4k) [z^k] (1+z)^{n+1} = (n+1-4k) {n+1\choose k}.$$

The second part is

$$ 2\sum_{q=0}^{k} {2k\choose q} q {n-2k+1\choose k-q} = 2\sum_{q=1}^{k} {2k\choose q} q {n-2k+1\choose k-q} \\ = 4k\sum_{q=1}^{k} {2k-1\choose q-1} {n-2k+1\choose k-q} = 4k\sum_{q=0}^{k-1} {2k-1\choose q} {n-2k+1\choose k-q-1} \\ = 4k\sum_{q=0}^{k-1} {2k-1\choose q} [z^{k-1-q}] (1+z)^{n-2k+1} \\ = 4k [z^{k-1}] (1+z)^{n-2k+1} \sum_{q=0}^{k-1} {2k-1\choose q} z^q = 4k [z^{k-1}] (1+z)^{n-2k+1} \sum_{q\ge 0} {2k-1\choose q} z^q \\ = 4k [z^{k-1}] (1+z)^{n-2k+1} (1+z)^{2k-1} = 4k [z^{k-1}] (1+z)^{n} = 4k {n\choose k-1} \\ = 4k {n+1\choose k} \frac{k}{n+1}.$$

Adding these two we get the second closed form

$$\bbox[5px,border:2px solid #00A000]{ \frac{(n+1-2k)^2}{n+1} {n+1\choose k}.}$$

It is now immediate that we obtain a Catalan number when $n=2k$ since

$$\frac{1}{2k+1} {2k+1\choose k} =\frac{1}{2k+1} {2k\choose k} \frac{2k+1}{k+1} = \frac{1}{k+1} {2k\choose k}.$$

We keep in mind here that even though the closed form is defined including for $2k\gt n$ we must have that $2k\le n$ since $k$ balanced parenthesis require at least $2k$ characters, with a balanced parenthesis participating in exactly one pair. Observe that the closed form is symmetric with respect to $k\to n+1-k.$

The algorithm to generate well-balanced words of parentheses seems to have appeared several times at stackexchange sites, one reference is this link at stackoverflow com.