Minimum number of $k$-partitions of a set of size $n$ to enumerate all $n \choose k$ combinations

493 Views Asked by At

Given a set $\mathcal{S}$ of size $n$, let a $k$-partition of $\mathcal{S}$ be a grouping into $k$ disjoint classes $$(S_1 ,S_2,...,S_k)$$

Where the $S_i$ do not necessarily contain the same number of elements.

Let $C(S_1, S_2, ... S_k)$ be the set of combinations which can be made by taking exactly one element of each class, (i.e. $C(S_1, S_2, ... S_k)$ is the Cartesian Product $S_1 \times S_2 \times ...\times S_k$). For a given $(n,k)$, what is the minimum number of k-partitions $N$ such that

$\bigcup\limits_{i=1}^{N} C(S_1^{(i)}, S_2^{(i)}, ...,S_3^{(i)}) = $ All Possible Combinations

Example:

for $n=4$ and $k=2$, $\mathcal{S} = \{1,2,3,4\}$ we can choose:

$$ S^{(1)} = \left\{\left\{ 1, 2\right\}, \left\{ 3, 4\right\}\right\}$$ $$ S^{(2)} = \left\{\left\{ 1, 3\right\}, \left\{ 2, 4\right\}\right\}$$

Then $$C_{1} = \left\{\left\{ 1, 3\right\} ,\left\{ 1, 4\right\},\left\{ 2, 3\right\},\left\{ 2, 4\right\}\right\}$$ $$C_{2} = \left\{\left\{ 1, 2\right\} ,\left\{ 1, 4\right\},\left\{ 2, 3\right\},\left\{ 3, 4\right\}\right\}$$

and $C_1 \cup C_2 =$ All possible combinations and $N=2$.

2

There are 2 best solutions below

9
On BEST ANSWER

When $k=2$ we have $N = \left\lceil \log_2 n\right\rceil$. We can interpret the problem as covering a complete graph $K_n$ with $n$ bipartite graphs, where the $i^{\text{th}}$ bipartite graph contains all the edges between $S_1^{(i)}$ and $S_2^{(i)}$. This was previously asked on MSE here. The lower bound $N \ge \log_2 n$ is the same as the general one I'll give below. The upper bound is obtained by letting $S_1^{(i)}$ be the set of all elements of $\{1,2,\dots,n\}$ whose binary representation has a $0$ in the $i^{\text{th}}$ position, and letting $S_2^{(i)}$ be the complement. (This takes $\mathcal S = \{1,2,\dots,n\}$.)

For general $k$, we have $N \ge \log_b \frac n{k-1}$ where $b = \frac{k}{k-1}$. To prove this, we induct on $n$. This is true for $n < k$, since at that point the lower bound we get is nonpositive.

No matter how we choose $(S_1^{(1)}, \dots, S_k^{(1)})$, we still have:

  1. Each of the sets $\mathcal S - S_1^{(1)}, \dots, \mathcal S - S_k^{(1)}$ has no $k$-element subsets found in $C(S_1^{(1)}, \dots, S_k^{(1)})$.
  2. One of these sets has size at least $\frac{k-1}{k} \cdot n = \frac n b$, because one of the sets $S_i^{(1)}$ has size at most $n/k$.

So passing to a set $\mathcal S' = \mathcal S - S_i^{(1)}$ with both properties, we see that we've obtained none of the $k$-element subsets of $\mathcal S'$ so far. By the inductive hypothesis, we still need $\log_b \frac{n/b}{k-1} = \log_b \frac{n}{k-1} - 1$ more $k$-partitions to see all the $k$-element subsets of $\mathcal S'$; together with the $k$-partition we already used, we need $\log_b \frac{n}{k-1}$ total, as desired.

This gives us the bound $N \ge \log_2 n$ when $k=2$, but I can't think of a matching construction when $k \ge 3$, so I don't know if the bound is tight in general.


For an upper bound, we can take a random construction. Choose a random $k$-tuple $(S_1, S_2, \dots, S_k)$ of subsets of $\mathcal S$ by adding each element of $\mathcal S$ to a uniformly random $S_i$. If we do this, then the probability that a given $k$-set is contained in $C(S_1, S_2, \dots, S_k)$ is $\frac{k!}{k^k}$.

If we now take $N$ random $k$-tuples of this form, the probability that a given $k$-set is never covered is $(1 - \frac{k!}{k^k})^N$, so the expected number of uncovered $k$-sets is $\binom nk (1 - \frac{k!}{k^k})^N$. All we need to do is choose $N$ so that this expected value is less than $1$, and we can conclude that with positive probability, the random construction covers all $k$-sets.

Because $1 - x \le e^{-x}$ for all $x$, we have $\binom nk (1 - \frac{k!}{k^k})^N \le \binom nk e^{-N k!/k^k}$, so we can choose $N = \frac{k^k}{k!} \ln \binom nk$. For fixed $k$, this gives a logarithmic upper bound (in $n$) to match the logarithmic lower bound - but the lower bound $\log_b \frac{n}{k-1}$ is roughly $k \ln n$, whereas this upper bound is roughly $k e^k \ln n$, so the constants are pretty far apart.

0
On

Rather than $N$ I'm going to use the OEIS convention $T(n, k)$ (for triangle, I think).

Misha Lavrov's answer gives the bound $$T(n, k) \ge \frac{\log n - \log(k-1)}{\log k - \log(k-1)}$$

Another, easier, lower bound uses the auxiliary function A152072 which gives the largest possible size of the Cartesian product $S_1 \times S_2 \times \cdots \times S_k$: then clearly $$T(n, k) \ge \frac{\binom n k}{\textrm{A152072}(n, k)} = \frac{\binom n k}{ \left\lceil n/k \right\rceil^{n \bmod k} \left\lfloor n/k\right\rfloor^{k-n \bmod k}}$$

Neither of these bounds is always better than the other. In fact, we can go further. Note that when $k = n-1$ Misha's bound is $2$, whereas my bound is $\left\lceil\frac n2\right\rceil$ (and this is tight: consider $S_1^{(i)} = \{2i, (2i+1) \bmod n\}$, all other $S_j^{(i)}$ are singletons). On the other hand, when $k=2$ Misha's bound is $\left\lceil \log_2 n\right\rceil$, whereas my bound is $2$. Therefore neither bound is asymptotically tight.


Looking at actual figures, we have (starting at $n=1$, $k=1$)

$$\begin{matrix}1 \\ 1& 1 \\ 1& 2& 1 \\ 1& 2& 2& 1 \\ 1& 3& 3& 2& 1 \\ 1& 3& 3& 3& 2& 1 \\ 1& 3& 4& 3& 3& 2& 1 \\ 1& 3& 4& 4& 4& 3& 2& 1\end{matrix} \textrm{ vs }\quad \begin{matrix}1 \\ 1& 1 \\ 1& 2& 1 \\ 1& 2& 2& 1 \\ 1& 2& 3& 3& 1 \\ 1& 2& 3& 4& 3& 1 \\ 1& 2& 3& 5& 6& 4& 1 \\ 1& 2& 4& 5& 7& 7& 4& 1 \end{matrix}$$ for a combined lower bound of $$\begin{matrix}1 \\ 1& 1 \\ 1& 2& 1 \\ 1& 2& 2& 1 \\ 1& 3& 3& 3& 1 \\ 1& 3& 3& 4& 3& 1 \\ 1& 3& 4& 5& 6& 4& 1 \\ 1& 3& 4& 5& 7& 7& 4& 1\end{matrix}$$


By brute force, the actual start of the table should be

$$\begin{matrix}1 \\ 1& 1 \\ 1& 2& 1 \\ 1& 2& 2& 1 \\ 1& 3& 3& 3& 1 \\ 1& 3& 3& 5& 3& 1 \\ 1& 3& 4& 6& 6& 4& 1 \\ 1& 3& 4& 6& 8& 8& 4& 1\end{matrix}$$