How many ways we can partition a multiset, where each part/segment in the partition has distinct elements?

245 Views Asked by At

We define the set S as $\{(s_1, f_1), (s_2, f_2), ..., (s_i, f_i)\}$, where each $f_i$ is the frequency that $s_i$ is repeated in the multiset T. How many ways can we partition the multiset T into different partitions, where each partition includes distinct part/segments?

For example, given $S = \{(s_1, 1), (s_2, 2), (s_3, 1)\}$ results in the multiset $T = \{s_1, s_2, s_2, s_3\}$, where s1 is repeated once, s2 is repeated twice, and s3 repeated once in the multiset. Now, we can partition the multiset into six different partitions, where each partition includes distinct parts/segments as follows:

1:$\{\{s_1,s_2\},\{s_2,s_3\}\}$

2:$\{\{s_1,s_2,s_3\},\{s_2\}\}$

3:$\{\{s_1,s_2\},\{s_2\},\{s_3\}\}$

4:$\{\{s_1\},\{s_2\},\{s_2,s_3\}\}$

5:$\{\{s_1,s_3\},\{s_2\}, \{s_2\}\}$

6:$\{\{s_1\},\{s_2\},\{s_2\},\{s_3\}\}$

  • Note that the order of elements in each segment does not matter, e.g., $\{s_1, s_2\} = \{s_2, s_1\}$
  • Note that each subset has different/distinct elements.

Now, I am looking for a formula that counts the number of partitions of the set T, where $S = \{(s_1, f_1), (s_2, f_2), ..., (s_i, f_i)\}$. I am curious to know if it is possible to compute it in terms of the number of elements in the set S and the corresponding frequencies.

1

There are 1 best solutions below

6
On

This problem can be solved using the Polya Enumeration Theorem. We get very straightforwardly that the desired quantity is given by where $F = f_1+f_2+\cdots+f_n$

$$\sum_{k=1}^F [A_1^{f_1} A_2^{f_2} \times\cdots\times A_n^{f_n}] Z\left(S_k; -1 + \prod_{q=1}^n (1+A_q)\right).$$

where we refer to the cycle index of the symmetric group. We now use the recurrence by Lovasz for the cycle index $Z(S_k)$ of the multiset operator $\def\textsc#1{\dosc#1\csod} \def\dosc#1#2\csod{{\rm #1{\small #2}}} \textsc{MSET}_{=k}$ on $k$ slots, which is

$$Z(S_k) = \frac{1}{k} \sum_{\ell=1}^k a_\ell Z(S_{k-\ell}) \quad\text{where}\quad Z(S_0) = 1.$$

Next we introduce $T(Q, k)$ where $Q$ is a monomial in the variables $A_q$ and $k$ is a non-negative integer and

$$T(Q, k) = [Q] Z(S_k; -1 + \prod_{A\in Q} (1+A)).$$

We also put

$$S(Q) = -1 + \prod_{A\in Q} (1+A).$$

These are the sets that go into the $k$ slots. We thus obtain from the Lovasz recurrence a recurrence for $T:$

$$T(Q, k) = \frac{1}{k} \sum_{\ell=1}^k \sum_{P\in S(Q)}' T(Q/P^\ell, k-\ell).$$

The mark on the sum signifies two things, first we only recurse when $Q/P^\ell$ is a proper monomial including the value one and second, that all monomials are represented by a product

$$A_1^{g_1} A_2^{g_2} \times \cdots \times A_p^{g_p}$$

with the degree sequence $g_1\le g_2\le\cdots\le g_p$ in increasing order. This is so that we may properly memoize the recurrence as the result only depends on the partition induced by the monomials (order of variables does not matter). We can take the recurrence for $T$ and more or less translate it directly into CAS code, we just have to take care of the base cases, the most important of which is that when we have reached only one variable then we get a contribution of one if and only if the degree of the variable equals $k$, and zero otherwise. That is all, if we are after a total count we just add the values for the parameter $k.$

We can use the following special multisets to compute some example values, where there are $n$ elements in the multiset and $m$ copies of each element. For example, with $n=7$ elements and $m=3$ copies of each we have for the possible values starting with $k=3$ and going to $k=21$ the sequence

$$1, 714, 84000, 1737813, 11673597, 35162333, 57789691, 59078859, \\ 41165320, 20857585, 8046164, 2441211, 595456, \\ 118300, 19236, 2541, 266, 21, 1.$$

The reader may want to use this to verify their computations. Another example is for $n=6$ elements and $m=4$ copies of each we get with $k$ rangeing from $k=4$ to $k=24$:

$$1, 201, 18171, 396571, 3053216, 11003801, 22360580, 29114463, \\ 26607981, 18227245, 9816458, 4301588, 1572206, 487670, \\ 129880, 29828, 5901, 995, 140, 15, 1.$$

Another important sanity check is that we should get Stirling numbers of the second kind when $m=1$ and indeed this check also goes through. The Maple code shown below implements three routines. First a plain enumeration routine that can be used to verify correctness of the more sophisticated alternatives. Second, computation by the Polya Enumeration Theorem directly substituting into the cycle index $Z(S_k)$ and third, the recurrence described above. The number of entries in the memoization table for $T$ is given by the value of the partition function summed for up to the total degree of the inital value of $Q.$ This is the code:

with(combinat);

pet_cycleind_symm_maxcl :=
proc(n, m)
option remember;

    if n=0 then return 1; fi;

    expand(1/n*add(a[l]*pet_cycleind_symm_maxcl(n-l, m),
                   l=1..min(m,n)));
end;

pet_varinto_cind :=
proc(poly, ind)
local subs1, subs2, polyvars, indvars, v, pot, res;

    res := ind;

    polyvars := indets(poly);
    indvars := indets(ind);

    for v in indvars do
        pot := op(1, v);

        subs1 :=
        [seq(polyvars[k]=polyvars[k]^pot,
             k=1..nops(polyvars))];

        subs2 := [v=subs(subs1, poly)];

        res := subs(subs2, res);
    od;

    res;
end;

# Polya Enumeration Theorem
T1 :=
proc(n, k, m)
option remember;
local rep, q, p, gf;

    rep := -1 + mul(1+A[q], q=1..n);
    gf := pet_varinto_cind(rep, pet_cycleind_symm_maxcl(k, m));

    for q to n do
        gf := diff(gf, A[q]$m);
        gf := 1/m!*subs(A[q]=0, gf);
    od;

    gf;
end;


# sanity check for small arguments of the parameters
ENUM :=
proc(n, k, m)
option remember;
local mset, allmsets, idx, digits, dix, src, sidx;

    if k=1 then return 1 fi;

    src := [seq(V[q]$m, q=1..n)];
    allmsets := table();

    for idx from k^(m*n) to 2*k^(m*n)-1 do
        digits := convert(idx, `base`, k)[1..m*n];
        if nops(convert(digits, `set`)) = k then
            mset := table([seq(q=1, q=1..k)]);
            for sidx to m*n do
                dix := digits[sidx] + 1;

                if src[sidx] in indets(mset[dix]) then
                    break;
                fi;

                mset[dix] := mset[dix] * src[sidx];
            od;

            if sidx = m*n+1 then
                allmsets[sort([entries(mset, `nolist`)])] := 1;
            fi;
        fi;
    od;

    numelems(allmsets);
end;

dgseq :=
proc(ms)
local vars, var, ds, q;
    vars := indets(ms);
    ds := sort([seq(degree(ms, var), var in vars)]);
    mul(A[q]^ds[q], q=1..nops(vars));
end;


T2gen :=
proc(ms, k)
option remember;
local vars, var, rep, term, res, sbms, ell;

    vars := indets(ms);

    if nops(vars) = 0 and k = 0 then
        return 1;
    elif nops(vars) = 0 or k=0 then
        return 0;
    fi;

    if nops(vars) = 1 then
        if degree(ms) = k then
            return 1;
        else
            return 0;
        fi;
    end;

    rep := -1 + expand(mul(1+var, var in vars));
    res := 0;

    for ell to k do
        for term in rep do
            sbms := ms/term^ell;
            if type(sbms, `polynom`) then
                res := res + T2gen(dgseq(sbms), k-ell);
            fi;
        od;
    od;

    res/k;
end;

T2 :=
proc(n,k,m)
    local q;
    T2gen(mul(A[q]^m, q=1..n), k);
end;