Permutations with simultaneous repetition and restriction(subset length)

638 Views Asked by At

I'm trying to figure out a problem, but I'm unable to find the formula to use, and not knowledgeable enough to try to derive it myself.

The problem is that I have a string of between 10-15 characters, which may contain duplicates. I want to calculate how many 10 character strings I can make from it that are unique. The strings are random text snippets, for ex lets say i have: $$(parallelfoods)$$

The math as I've understood it: I have a multiset of $n$ items with $n_1$ of kind 1, $n_2$ of kind 2. I want to calculate the number of length $k$ tuples from this multiset (where $k<n$). So in the example it's: $$n=13$$ $$n_p=1, n_a=2, n_r=1, n_l=3\ldots$$ $$k=10$$

My progress: I have been able to learn from several sites how to calculate these two cases separately. The duplicate items is solved by: $$\frac{n!}{(n_1!n_2!\ldots)}$$

Similarly I understand that the limiting to $k$ items is solved using: $$\frac{n!}{(n-k)!}$$

Where I've come to a halt is when I need to combine the two as the problem most often will require both of them.

In the example the multiset $(parallelfoods)$ has $n=13$ (that is $n>k$) as well as several items that are duplicate.

How can I calculate the number of possible unique $k$ length tuples of the multiset without having to generate all of them?

1

There are 1 best solutions below

0
On

The phrase $parallel foods$ can be viewed as the multiset $$\{1 \cdot p, 2 \cdot a, 1 \cdot r, 3 \cdot l, 1 \cdot e, 1 \cdot f, 2 \cdot o, 1 \cdot d, 1 \cdot s\}$$ We wish to count the number of strings of length $10$ that can be formed with the letters of this multiset.

We consider cases:

  1. Nine distinct letters are used, including one letter that appears twice.
  2. Eight distinct letters are used, including one letter that appears three times.
  3. Eight distinct letters are used, including two letters that each appear twice.
  4. Seven distinct letters are used, including one letter that appears three times and one other letter that appears twice.
  5. Seven distinct letters are used, including three letters that each appear twice.
  6. Six distinct letters are used, including one letter that appears three times and two other letters that each appear twice.

Case 1: Nine distinct letters are used, including one letter that appears twice.

Choose which of the three letters that have multiplicity greater than one will be used twice. Choose two of the ten positions for that letter. Arrange the other eight distinct letters in the multiset in the remaining eight positions. There are $$\binom{3}{1}\binom{10}{2}8!$$ such arrangements.

Case 2: Eight distinct letters are used, including one letter that appears three times.

The only letter that can appear three times is $l$. Choose three of the ten positions in the string for the $l$s. If $l$ appears three times, then one of the other eight letters must not appear in the string. Choose which letter to exclude. Arrange the remaining seven distinct letters in the multiset in the remaining seven positions. There are $$\binom{10}{3}\binom{8}{1}7!$$ such arrangements.

Case 3: Eight distinct letters are used, including two letters that each appear twice.

Choose which two of the three letters with multiplicity greater than one will appear twice. Choose two of the ten positions in the string for the selected letter that appears first in an alphabetical list. Choose two of the remaining eight positions in the string for the other letter that will appear twice. If these two letters are each used twice, one of the other seven letters in the multiset will not appear in the string at all. Choose which letter will be excluded. Arrange the remaining six distinct letters in the multiset in the remaining six positions of the string. There are $$\binom{3}{2}\binom{10}{2}\binom{7}{1}6!$$ such arrangements.

Case 4: Seven distinct letters are used, including one letter that appears three times and one other letter that appears twice.

The letter that appears three times must be $l$. Choose three of the ten positions in the string for the $l$s. Choose whether $a$ or $o$ appears twice. Choose two of the remaining seven positions in the string for that letter. If $l$ appears three times and another letter appears twice, two of the other seven letters in the multiset must not appear at all. Choose which two of them to exclude. Arrange the remaining five distinct letters of the multiset in the remaining five positions in the string. There are $$\binom{10}{3}\binom{7}{2}\binom{7}{2}5!$$ such arrangements.

Case 5: Seven distinct letters are used, including three letters that each appear twice.

The three letters that each appear twice must be $a$, $l$, and $o$. Choose two of the ten positions in the string for the $a$s, two of the remaining eight positions in the string for for the $l$s, and two of the remaining six positions in the string for the $o$s. Since $a$, $l$, and $o$ each appear twice, two of the other six letters in the multiset must not appear at all in the string. Choose which two letters to exclude. Arrange the remaining four distinct letters of the multiset in the remaining four positions. There are $$\binom{10}{2}\binom{8}{2}\binom{6}{2}\binom{6}{2}4!$$ such arrangements.

Case 6: Six distinct letters are used, including one letter that appears three times and two other letters that each appear twice.

The letter that appears three times must be $l$. The letters that each appear exactly twice must be $a$ and $o$. Choose three of the ten positions for the $l$s, two of the remaining seven positions for the $a$s, and two of the remaining five positions for the $o$s. Since $l$ appears three times and $a$ and $o$ each appear twice, three of the other six letters in the multiset must not appear at all. Choose which three letters to exclude. Arrange the remaining three letters of the multiset in the remaining three positions. There are $$\binom{10}{3}\binom{7}{2}\binom{5}{2}\binom{6}{3}3!$$ such arrangements.

Total: Since the six cases outlined above are mutually exclusive and exhaustive, the number of distinguishable strings of length $10$ can be found by adding the above cases.

Note: This problem could also be approached using exponential generating functions.