The multinomial coefficient and number of unique words

182 Views Asked by At

Let's define $C$ as the number of possible sequences of length $N$ we can form using symbols $A_i$ from an alphabet $A = \{A_1,\ldots , A_M\}.$

To my knowledge, we always have $C = M^{N+1}$.

For example, the total count of numbers containing 3 digits that we can form using decimal digits is $10^4$ = 1000, which would be the cardinality of $\{000, 001, ... 999\}$.

Or for example, the total count of binary numbers we can build with 3 bits is $2^4$ = 16, which would be the cardinality of $\{000, ..., 111\}$.

Separately, I have read that the multinomial coefficient ${n \choose k_{1},k_{2},\ldots ,k_{m}}={\frac {n!}{k_{1}!\,k_{2}!\cdots k_{m}!}}$ can be understood as:

The number of distinct ways to permute a multiset of n elements, and $k_i$ are the multiplicities of each of the distinct elements.

How are these problems and quantities related? Are they one and the same for some choices of $k_i$, $n$, $N$ and $M$?

Put another way, can $C$ be written as a multnomial coefficient?

3

There are 3 best solutions below

1
On BEST ANSWER

Suppose that you have an alphabet $A=\{a_1,\ldots,a_m\}$ of $m$ symbols. Using this alphabet we can form $m^n$ distinct sequences of length $n$ (not $m^{n+1}$). Now suppose that $$k_1+k_2+\ldots+k_m=n\;;$$ the multinomial coefficient $\binom{n}{k_1,\ldots,k_m}$ is the number of distinct sequences of length $n$ that contain exactly $k_1$ instances of $a_1$, $k_2$ instances of $a_2$, and so on. The relationship between the two is then straightforward:

$$m^n=\sum_{\substack{\langle k_1,\ldots,k_m\rangle\in\Bbb N^m\\k_1+\ldots+k_m=n}}\binom{n}{k_1,\ldots,k_m}\;:$$

the sum is taken over all possible counts of the different symbols.

0
On

For example, take $n = 3$, $m=2$, $k_1 = 1$ and $k_2 = 2$. $${3 \choose 1,2} = \frac{3!}{1!2!}=3$$ is the number of $3$-letter words consisting of one "a" and two "b". $$ abb, bab, bba $$

On the other hand, $2^3 = 8$ is the total number of $3$-letter words from this alphabet. This includes the cases $(k_1,k_2) = (3,0), (2,1), (1,2), (0,3)$.

In general, $$ m^n = \sum_{k} {{n} \choose {k_1,\ldots, k_m}}$$ where the sum is over all $m$-tuples of nonnegative integers whose sum is $n$.

2
On

$C$ represents the number of strings you can write without any limitation, the multinomial coefficient gives you the number of strings you can write when using exactly $k_1$ times $A_1$, $k_2$ times $A_2$ and so on. Hence, the sum of all the multinomial coefficients (going over all $k$s such that $\sum k_i = n$ must give you $C$).

Say $n=2$, and the alphabet has two letters: A,B.

Total number of words ($C$): AA,AB,BA,BB.

$k_1=2,k_2=0$: AA

$k_1=1,k_2=1$: AB,BA

$k_1=0,k_2=2$: BB