Number of distinct letters in an order-invariant tuple

56 Views Asked by At

Let a language $\mathcal{L}$ have $N$ letters.

Now, consider the following tuple:

$$X = (x_1, x_2, \ldots, x_k).$$

Each $x_i$ is a letter chosen from $\mathcal{L}$ and repetitions are allowed.

We have one additional constraint: the ordering of the letters does not matter. So, if $p$ and $q$ are two letters belonging to $\mathcal{L}$, and if $k = 2$, $(p, q)$ and $(q, p)$ are considered the same tuple and counted only once. However, $(p, p)$ and $(q, q)$ are two different tuples.

Similarly, define another tuple

$$Y = (y_1, y_2, \ldots, y_k).$$

Now, consider the combined tuple

$$ Z = (X, Y).$$

Here's a constraint for $Z$: for the "$X$ part" and the "$Y$ part" of $Z$ respectively, ordering does not matter. However, the two parts "do not mix."

For example, let $n = r = k= 2$.

Then $Z = (0, 1, 0, 1)$ and $(1, 0, 1, 0)$ are counted as a single tuple as they are generated by $X = (0, 1)$ (same as $X = (1, 0)$) and $Y =(0, 1)$ (same as $Y = (1, 0)$). However, $Z = (0, 1, 0, 1)$ and $Z = (1, 1, 0, 0)$ (generated by $X = (1, 1)$ and $Y = (0, 0)$) are different tuples.

For some $r \in [2k]$, how many $Z$s are there with exactly $r$ distinct letters such that the cases for which $X = Y$ are excluded?

For example, $Z = (0, 0, 1, 1)$ is a valid tuple, but $Z = (1, 0, 1, 0)$ is not.


I feel like we need to invoke the "stars-and-bars" argument somehow here, just because order does not matter. But I could not make this formal.

1

There are 1 best solutions below

2
On BEST ANSWER

There are $\binom Nr$ ways to choose the $r$ distinct letters appearing in $X$ and $Y$. From now on, suppose this set of $r$ letters has been chosen, and we want to count the number of ways to choose $X$ and $Y$ using all $r$ of these chosen letters.

Let $a_1,\dots,a_r$ be the number of times each of these letters appears in $X$. This means $$a_1+\dots+a_r=k\tag 1$$ The number of ways to choose $X$, using stars and bars, is therefore $\binom{k+r-1}{k}$. There are the same number of ways to choose $Y$, so the number of ordered pairs $(X,Y)$ is $\binom{k+r-1}{k}^2$. However, these include selections where both $X$ and $Y$ are missing one of the $r$ letters. For each letter, we need to subtract away the bad pairs where that letter is missing. This requires using the principle of inclusion exclusion.

I will sketch the details of the inclusion exclusion argument. To count the number of choices of $(X,Y)$ where the first letter is missing, you need to count the solutions to $(1)$ where $a_1=0$. This is equivalent to counting solutions to $a_2+\dots+a_k=0$, which again by stars and bars is $\binom{k+(r-1)-1}{k}$. You need to choose $Y$ as well, so square that. We also need to subtract the cases where $a_2=0$, then where $a_3=0$, etc, so we multiply this by $r$. We are currently at $$ \binom{k+r-1}k^2-r\binom{k+r-2}{k}^2 $$ However, this is not yet correct. The problem is that solutions where two variables are zero have been subtracted twice, so they need to be added back in. Using the same idea, for each $i\neq j$, we add in the solutions to $(1)$ where $a_i=a_j=0$. There are effectively two fewer variables, so there are $\binom{k+r-3}{k}^2$ ways to choose $(X,Y)$. Multiplying by $\binom r2$ for the ways to choose the "bad" indices $i$ and $j$, we are now at $$ \binom{k+r-1}k^2-r\binom{k+r-2}{k}^2+\binom r2 \binom{k+r-3}{k}^2 $$ Continuing this pattern through the triple intersections, quadruple intersections, etc, the result is $$ \#\text{ ways} =\sum_{i=0}^r (-1)^i\binom{r}i\binom{k+r-i-1}{k}^2 $$ However, this analysis includes cases where $X=Y$, so we need to subtract these all away. If a pair $(X,Y)$ where $X=Y$ uses all $r$ letters, then that means $X$ uses all $r$ letters. The number of ways to choose such an $X$ is the number of solutions to $a_1+\dots+a_r=k$ where $a_i$ is a positive integer for each $i\in [r]$. By stars and bars, the number of ways is $\binom{k-1}{r-1}$.

Putting this all together, the number of ways to choose $Z$ is $$ \binom Nr\cdot\left[\left[\sum_{i=0}^r (-1)^i\binom{r}i\binom{k+r-i-1}{k}^2\right]-\binom{k-1}{r-1}\right] $$