Union of sets with pairwise intersection having half of the elements

Question

Union of sets with pairwise intersection having half of the elements

1k Views Asked by Bumbble Comm At 10 May 2026 - 4:01

Consider $k$ sets $S_1, S_2, \ldots, S_k$, of the following properties:

For every $i$, $\left|S_i\right| = p$
For every pair of $i$ and $j$, $\left|S_i \cap S_j\right| = \frac{p}{2}$

Now I need to find the minimum of $\left|\bigcup S_i\right|$ in terms of $k$ and $p$.

(Assume $p$ is large enough to be divisible by any small enough integers)

Via trial and error I suspect the answer to be $$\frac{\left(2^{m + 1} - 1\right)p}{2^m}$$ where $m = \lfloor\log_2 k\rfloor$, but cannot prove it.

That's also why I give that assumption about $p$, so that one do not need to worry about the situation where $\frac{p}{2^m}$ is not an integer.

Edition: actually, I verified that my suspected answer is wrong. For example, when $k = 5$, I managed to construct the sets using $\frac{5p}{6}$ elements. Now I am not sure what the actual answer should be. :(

Correction: I think I had mistaken $p$ with another variable I used in a larger problem where this proof is required, when giving the $\frac{5p}{6}$ lower bound in the edition above. It should be $\frac{5p}{3}$.

An example would be when $k = 5$ and $p = 6$ (to be divisible by both $2$ and $3$), then my $5$ sets would be $$\left\{1, 2, 3, 4, 5, 6\right\}, \left\{1, 2, 3, 7, 8, 9\right\}, \left\{1, 4, 6, 7, 9, 10\right\}, \left\{2, 4, 5, 7, 8, 10\right\}, \left\{3, 5, 6, 8, 9, 10\right\}$$ using $10$ elements. They are obtained basically as I try to "average" the occurrences of each number (in this example, specifically, letting each number appear $3$ times). Maybe we can go from here, but how to prove in general that this average is reachable?

Original Q&A

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2018-09-21 18:45:34

Not a full answer (or perhaps: answering a different question)

Discussion: For shorthand, write $U = |\bigcup S_i|$. Let $f(k,p)$ denote the minimum $U$, as a function of both $k$ (integer) and $p$ (even integer). There is no decomposition $f(k,p) = g(k) \cdot p$ that holds exactly for all $k,p$. (E.g. if $p=2, k=5$, the only solution I can think of has $U = 6$ where every set has 1 element unique to itself and 1 element common to every set, so $g(5) = 6/2=3$. Meanwhile the OP showed an example for $p=6, k=5, U=10$ so $g(5) = 10/6 = 5/3$.) For that reason, I also find it strange to say lets "assume" $p$ is large and highly composite... unless what you are actually interested in minimizing the ratio $U/p$, e.g. for large $p$, or for optimal $p$. E.g., were you perhaps interested in $h(k) = \min_p {f(k,p) \over p}$, i.e. the minimum ratio (that holds for some $p$, i.e. minimized over all possible even integers $p$)?

Anyway, depending on whether you want to minimize universe size $U$, or minimize the ratio $U/p$, the answers may be different. The rest of this post shows 2 separate families of solutions, one better at minimizing $U$ and the other better at minimizing $U/p$.

Family 1: based on $d$-bit binary vectors

Consider $d$-bit binary vectors $v \in \{0,1\}^d$. Let $D=\{1, 2, ..., d\}$ denote the bit positions. Let $T$ be a non-empty subset of $D$. There are $2^d - 1$ such subsets, which we name $T_1, T_2, ..., T_{2^d -1}$.

Define $S_i = \{v \in \{0,1\}^d: \sum_{t \in T_i} v_t = 1 \pmod 2\}$. I.e., for any vector $v$, sum up (modulo 2) all the bits $v_t$ at positions $t \in T_i$, and if the sum is $1$ then the vector belongs in $S_i$, otherwise it doesnt. E.g. for $T_i = \{2,3,5\}$, any vector $(*,1,1,*,1,*,*) \in S_i$ but any vector $(*,1,0,*,1,*,*) \notin S_i$, where $*$ denotes wildcard ($0$ or $1$).
For any $i$, it is obvious that exactly 1/2 of all vectors are in $S_i$, i.e. $|S_i| = 2^{d-1} = p$.
I think it's also true (although I cannot think of a short proof) that for any $i, j$, exactly 1/4 of all vectors are in $S_i \cap S_j$, i.e. $|S_i \cap S_j| = 2^{d-2} = p/2$.

Thus the sets $S_i$ meet the OP's requirements. There are $2^d - 1$ such sets, and of course you don't have to use all of them. In other words we have constructed examples, parameterized by $d$, where:

individual set size $p = 2^{d-1}$
no. of sets $k \le 2^d - 1$
universe size $U = 2^d - 1$, because every vector is in the union EXCEPT the zero vector
ratio $U/p = {2^d - 1 \over 2^{d-1}}$, akin to the original formula the OP suggested.

Family 2: based on $m$-choose-$n$

Let me now exhibit a different family of solutions generalizing the OP's $k=5, p=6$ example.

Suppose you have $m$ distinct objects $\{1, 2, ..., m\}$ and you choose $n<m$ of them. There are of course $L={m \choose n}$ ways to do this, i.e. there are $L$ such size-$n$ subsets. Order these $L$ subsets lexicographically and call them $T_1, T_2, ... T_L$. E.g. if $n=3$ then $T_1 = \{1,2,3\}, T_2 = \{1,2,4\}, ..., T_L = \{m-2, m-1, m\}$, etc. Now form the following $m\times L$ matrix $A$:

$A_{ij}=1$ if object $i \in T_j$.
$A_{ij}=0$ otherwise.

I.e. the $j$th column of $A$ is the Boolean vector for membership in $T_j$. Now switch your perspective and...

Identify the rows of $A$ as the OP's $S_i$ sets.
Identify the set of columns as the universe $\bigcup S_i$, and in particular, $L = |\bigcup S_i| = U$.

The following statements hold:

Each row contains the same number of $1$s: indeed, $\forall i, \sum_j A_{ij} = {m-1 \choose n-1}$, independent of $i$. This is the number of subsets $T_j$ that contains $i$, which equals the number of ways to choose the remaining $n-1$ elements out of the remaining $m-1$ possibilities. Since we identify each row $i$ as the OP's set $S_i$, this means $|S_i| = p = {m-1 \choose n-1}$.
For any two rows $i, i'$, their intersection contains the same number of $1$s: indeed, $\forall i, i', \sum_j (A_{ij}A_{i'j}) = {m-2 \choose n-2}$, independent of $i, i'$. This is the number of subsets $T_j$ that contains both $i$ and $i'$, which equals the number of ways to choose the remaining $n-2$ elements out of the remaining $m-2$ possibilities. Since we identify rows $i, i'$ as the OP's sets $S_i, S_{i'}$, this means $|S_i \cap S_{i'}| = p/2 = {m-2 \choose n-2}$.
Now we solve: ${m-2 \choose n-2} = {1 \over 2} {m-1 \choose n-1} \implies n = 1 + {m-1 \over 2} = {m+1 \over 2}$.

In other words, parameterized by odd integer $m$, we can define $n = {m+1 \over 2}$ and we have exhibited a collection of sets (rows of $A$) where:

individual set size $p={m-1 \choose n-1}$
no. of sets $k \le m$
universe size $U = L = {m \choose n}$, the no. of columns of $A$
ratio $U/p = {m \choose n} / {m-1 \choose n-1} = m/n = {2m \over m+1}$

Examples and comparisons:

The follow table lists, for several choices of $k$, the "best" examples offered by each family and the resulting $p, U, U/p$. (In each family, the best is obtained by taking the smallest $d$ or $m$ possible.)

                Family 1             Family 2
                --------             --------
     k     d   p   U   U/p      m   n   p   U   U/p
   ---     ---------------      -------------------
     3     2   2   3   3/2      3   2   2   3   3/2
   4,5     3   4   7   7/4      5   3   6  10   5/3
   6,7     3   4   7   7/4      7   4  20  35   7/4
   8,9     4   8  15  15/8      9   5  70 126   9/5
  10,11    4   8  15  15/8     11   6 252 462  11/6
  12,13    4   8  15  15/8     13   7   .   .  13/7
  14,15    4   8  15  15/8     15   8   .   .  15/8

As can be seen, Family 1 generally has much smaller $U$ (except equality when $k=3$), but Family 2 has smaller $U/p$ ratio (except equality when $k=3, 6,7, 14, 15$, and if I may extrapolate: $k= 30, 31, 62, 63,$ etc)

**Bumbble Comm** · Answer 2 · 2018-09-21 18:56:13

Some estimation:

Let $$M:=\bigcup S_i = \{1,2,...,n\}$$ Let $d_i$ be a number of sets that $i\in M$ is in them. Then we have $$\sum_{i=1}^n d_i =k\cdot p$$ and $${k\choose 2}{p\over 2} = \sum _{i=1}^n{d_i\choose 2} $$

By Jensen we have:

$$\sum _{i=1}^n{d_i\choose 2} \geq {{1\over n}(\sum d_i)^2-(\sum d_i)\over 2}$$

so we have $$ {k\choose 2}{p\over 2}\geq {{1\over n}(kp)^2-(kp)\over 2}$$ and thus $${k-1\over 2}\geq {1\over n}kp-1$$ so $$n\geq {2kp\over k+1}$$

But I don't know how to find a configuration with $n=\Big[{2kp\over k+1}\Big]$. However, for odd $k$ equality is achieved if $d_1=d_2=...=d_n = {k+1\over 2}$

Union of sets with pairwise intersection having half of the elements

There are 2 best solutions below

Related Questions in COMBINATORICS

Related Questions in ELEMENTARY-SET-THEORY

Related Questions in COMBINATORIAL-DESIGNS

Trending Questions

Popular # Hahtags

Popular Questions