How to count pairs with Catalan distribution

89 Views Asked by At

Say I have a set of $n$ pairs, such that I have a total of $2n$ elements. I arrange them in pairs following a Catalan distribution, i.e. if I lay them in a 1D line, I have no crossing (See the following picture). My task is: Given a boundary of length L in this line, count the mean number of cut bonds.

enter image description here

Here I define $Q(n,L)$ as the mean number of cuts, i.e. $Q=\frac{\#cuts}{\#permutations}$

My progress: Given n pairs, the total number of permutations is given by the Catalan number $C_n=\frac{(2n)!}{(n+1)!n!}$. As I always have $L$ elements and $(2n-L)$ elements on the left/right side of the boundary, the mean number of cuts would be $Q(n,L)=L(2n-L)\frac{1}{C_n}$, i.e. all possible pairings distributed in a Catalan way. However, this does not work! For the examples shown before, I get Q(2,1)=3/2 Q(2,2)=4/2 and Q(3,1)=5/5, Q(3,2)=8/5, and Q(3,3)=9/5. Somehow I am overestimating the number of cuts and I cannot figure out how to solve it.

Any help would be appreciated.

1

There are 1 best solutions below

0
On BEST ANSWER

The quantity $L(2n-L)\frac1{C_n}$ assumes every possible cut bond has a $\frac1{C_n}$ chance of existing, which is incorrect for two reasons:

  1. The reason responsible for the overcount you've seen for small $n$: cut bonds with an odd number of points under them have no chance of existing at all: you can't match up those points.
  2. An opposite effect: some cut bonds appear multiple times. In the $n=3$ example, the bond connecting the first and last point appears in two pairing, so it contributes twice to every $Q(n,L)$.

To avoid this, let's begin by summing over the number of points under the cut bond: this can be $2k$ for $k$ ranging from $0$ to $n-1$.

How many bonds of this length crossing the boundary at $L$ can be formed? Well, the left point of the bond must be one of the first $L$ points, with two further restrictions:

  • The left point cannot be before point $L-2k$, or else the bond will not actually be cut by the boundary at $L$.
  • The left point cannot be after point $2n-2k-1$, or else the right point would need to be after point $2n$ to be placed correctly.

This gives us $\min\{L,2n-2k-1\} - \max\{1,L-2k\}+1$ possible cut bonds of this length.

For each one of them, there are $C_k$ ways to match up the $2k$ points under the bond, and $C_{n-k-1}$ ways to match up the $2n-2k-2$ points to either side of the bond. So the probabability that this bond is cut is $\frac{C_k C_{n-k-1}}{C_n}$.

In the end, we get the sum $$ Q(n,L) = \sum_{k=0}^{n-1} (\min\{L,2n-2k-1\} - \max\{1,L-2k\}+1) \cdot \frac{C_k C_{n-k-1}}{C_n}. $$ As a special case, this sum for $Q(n,1)$ will simplify to $1$ for all $n$. The boundary at $L=1$ will always cut exactly one bond: the bond whose left end is at point $1$.