There are two problems I am trying to solve. The first is a special case of the second, but I shall include it anyway as it possibly provides some insight.
- Let $0<\lambda\le 1$ and let $X$ be a set of $n$ elements. For $i=1,2,\dots ,m$, let $A_i$ and $B_i$ be disjoint subsets of $X$ such that $$\mid A_i\mid +\mid B_i\mid\le \lambda n$$ for every $i$. We say that the collection $$\Sigma=\lbrace (A_i,B_i):i=1,\dots,m\rbrace\subseteq \mathcal{P}(X)\times\mathcal{P}(X) $$ of pairs *separates* the elements of $X$ if for all $x,y\in X$ with $x\ne y$, there is some index $i$ such that $x$ is in one of $A_i$ or $B_i$, and $y$ is in the other. Show that if $\Sigma$ separates the elements of $X$ then $$\mid \Sigma \mid = m\ge \frac{\log_2n}{\lambda}.$$
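To make the definition concrete, here is a small sanity check in Python; the set, pairs, and $\lambda$ below are illustrative choices of mine, not part of the problem. The pairs given by the binary digits of $0,\dots,7$ separate $X$ with $\lambda = 1$ and meet the bound $m = \log_2 n$ exactly.

```python
from itertools import combinations
from math import log2

n = 8
X = range(n)
# m = 3 pairs: A_i = elements with bit i clear, B_i = elements with bit i set.
pairs = [({x for x in X if not x >> i & 1}, {x for x in X if x >> i & 1})
         for i in range(3)]

def separates(pairs, X):
    """Every pair of distinct x, y must be split by some (A_i, B_i)."""
    return all(any((x in A and y in B) or (x in B and y in A)
                   for A, B in pairs)
               for x, y in combinations(X, 2))

lam = 1  # here |A_i| + |B_i| = n for every i, so lambda = 1
assert all(len(A) + len(B) <= lam * n for A, B in pairs)
print(separates(pairs, X), len(pairs), log2(n) / lam)  # True 3 3.0
```

So for this family the claimed lower bound is tight: $m = 3 = \log_2 8$.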
My attempt: I think I should be able to provide a probabilistic argument.
Suppose $\Sigma$ separates the elements of $X$. Then, consider deleting either $A_1$ or $B_1$ from $X$ at random, and then continuing by deleting $C_i=A_i$ or $B_i$ (chosen at random) from what remains, for each $i\le m$. We are left with at most a single element: if two distinct elements both survived, then at the index separating them one of the two lay in the deleted set, a contradiction.
The expected proportion of elements removed at each step is at most $\frac{\lambda}{2}$, which heuristically suggests that about $(1-\frac{\lambda}{2})^m n$ elements remain. Requiring this to be at most $1$, we get a relation similar to what we are trying to show, but with the $\lambda$ inside the $\log$, so it does not yield the desired bound.
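The deletion process itself can be simulated; here is a hypothetical check on the binary-digit family on $\{0,\dots,7\}$ (my own example, not from the question), confirming that a separating family always leaves at most one element.

```python
import random

random.seed(0)
n = 8
X = set(range(n))
# Example separating family: A_i / B_i split on binary digit i.
pairs = [({x for x in range(n) if not x >> i & 1},
          {x for x in range(n) if x >> i & 1}) for i in range(3)]

for _ in range(1000):
    remaining = set(X)
    for A, B in pairs:
        remaining -= random.choice((A, B))  # delete A_i or B_i at random
    # Since the family separates X, at most one element can survive.
    assert len(remaining) <= 1
```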
The second question relaxes the condition $$\mid A_i\mid +\mid B_i\mid\le \lambda n$$ and replaces it with the condition $$\sum_{i=1}^m (\mid A_i\mid +\mid B_i\mid)\le \lambda mn$$
That is, instead of each pair having at most $\lambda n$ elements, we only require that the pairs have at most $\lambda n$ elements on average. I actually don't think this changes the result from part 1.
Thanks
Information theory background
The basic tool of this approach is the entropy function: given a random variable $\mathbf X$ with values $\{x_1, \dots, x_n\}$, its entropy is $$ H(\mathbf X) = \sum_{i=1}^n \Pr[\mathbf X = x_i] \log_2 \frac1{\Pr[\mathbf X = x_i]}. $$ Intuitively, this measures the average number of bits it takes to transmit $\mathbf X$ by an optimal encoding (up to technical details). These notes summarize the notions we will need for this solution:
The Wikipedia links are also a fine reference.
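As a quick numerical sanity check of the definition (an illustrative snippet of mine, not from the notes):

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a finite distribution."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

# A uniform variable with n outcomes has entropy log2(n) -- the value
# used for H(x) at the end of the solution.
n = 8
print(entropy([1 / n] * n))  # 3.0
print(entropy([1.0]))        # 0.0: a constant carries no information
```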
I will write $h(q)$ for the function $q \log_2 \frac1q + (1-q) \log_2 \frac1{1-q}$: the entropy of a Bernoulli variable with probability $q$. (Conventionally but irrelevantly for us, $h(0) = h(1) = 0$.)
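A direct translation of $h$ into Python, checking the bound $h(q) \le 1$ that the solution uses at the end (the grid of test values is arbitrary):

```python
from math import log2

def h(q):
    """Binary entropy of a Bernoulli(q) bit, with h(0) = h(1) = 0."""
    if q in (0, 1):
        return 0.0
    return q * log2(1 / q) + (1 - q) * log2(1 / (1 - q))

# h peaks at q = 1/2 and never exceeds 1 bit.
assert h(0.5) == 1.0
assert all(h(k / 100) <= 1.0 for k in range(101))
```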
Solution
Choose a uniformly random element $\mathbf x \in X$, and for each $i = 1, \dots, m$, choose a bit $\mathbf S_i$ as follows:
- if $\mathbf x \in A_i$, set $\mathbf S_i = 1$;
- if $\mathbf x \in B_i$, set $\mathbf S_i = 0$;
- if $\mathbf x \notin A_i \cup B_i$, set $\mathbf S_i = 1$ with probability $\frac{|A_i|}{|A_i| + |B_i|}$ and $\mathbf S_i = 0$ otherwise, independently of all other choices.
The following observations about these random variables will be relevant:
1. Since $\Sigma$ separates the elements of $X$, the bits $\mathbf S_1, \dots, \mathbf S_m$ determine $\mathbf x$: if two distinct elements were both consistent with the same bit string, the index $i$ separating them would force $\mathbf S_i$ to take two different values.
2. Conditioned on the value of $\mathbf x$, the bits $\mathbf S_1, \dots, \mathbf S_m$ are independent: each $\mathbf S_i$ is either a constant or a fresh independent coin flip.
In the language of information theory, the first observation tells us that $H(\mathbf S_1, \dots, \mathbf S_m) = H(\mathbf S_1, \dots, \mathbf S_m, \mathbf x)$; the second observation tells us that $H(\mathbf S_i \mid \mathbf S_1, \dots, \mathbf S_{i-1}, \mathbf x) = H(\mathbf S_i \mid \mathbf x)$.
So we have \begin{align} H(\mathbf S_1, \dots, \mathbf S_m) &= H(\mathbf S_1, \dots, \mathbf S_m, \mathbf x) && \text{Obs. 1}\\ &= H(\mathbf x) + \sum_{i=1}^m H(\mathbf S_i \mid \mathbf S_1, \dots, \mathbf S_{i-1}, \mathbf x) && \text{Chain rule} \\ &= H(\mathbf x) + \sum_{i=1}^m H(\mathbf S_i \mid \mathbf x). && \text{Obs. 2} \end{align} On the other hand, $H(\mathbf S_1, \dots, \mathbf S_m) \le \sum_{i=1}^m H(\mathbf S_i)$ by the subadditivity of joint entropy, so we get $$ H(\mathbf x) \le \sum_{i=1}^m \left(H(\mathbf S_i) - H(\mathbf S_i \mid \mathbf x)\right). $$ We can put an upper bound on each $H(\mathbf S_i) - H(\mathbf S_i \mid \mathbf x)$ individually. We have $H(\mathbf S_i) = h(\frac{|A_i|}{|A_i| + |B_i|})$, because $\mathbf S_i$ is a Bernoulli random variable with that success probability. Meanwhile, $H(\mathbf S_i \mid \mathbf x) = h(\frac{|A_i|}{|A_i| + |B_i|}) \cdot \frac{n - |A_i| - |B_i|}{n}$: with probability $\frac{n - |A_i| - |B_i|}{n}$, $\mathbf x \notin A_i \cup B_i$, in which case $\mathbf S_i$ is randomly chosen with probability $\frac{|A_i|}{|A_i| + |B_i|}$; for other values of $\mathbf x$, $\mathbf S_i$ is constant. Therefore $$ H(\mathbf S_i) - H(\mathbf S_i \mid \mathbf x) = h(\tfrac{|A_i|}{|A_i| + |B_i|}) \cdot \frac{|A_i| + |B_i|}{n} \le \frac{|A_i| + |B_i|}{n} $$ by the bound $h(q) \le 1$ for all $q \in [0,1]$.
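As a sanity check on this computation, here is a hypothetical numeric example (the sizes $n=10$, $|A_i|=2$, $|B_i|=3$ are arbitrary choices of mine) verifying $H(\mathbf S_i) - H(\mathbf S_i \mid \mathbf x) = h(\frac{|A_i|}{|A_i|+|B_i|}) \cdot \frac{|A_i|+|B_i|}{n} \le \frac{|A_i|+|B_i|}{n}$:

```python
from math import log2

def h(q):
    """Binary entropy of a Bernoulli(q) bit."""
    return 0.0 if q in (0, 1) else q * log2(1 / q) + (1 - q) * log2(1 / (1 - q))

# Arbitrary sizes for one pair: n elements, |A| = a, |B| = b, disjoint.
n, a, b = 10, 2, 3
p = a / (a + b)

# Pr[S = 1] = (a + (n - a - b) * p) / n, which simplifies to p,
# so S is Bernoulli(p) and H(S) = h(p).
pS1 = (a + (n - a - b) * p) / n
HS = h(pS1)

# Given x, S is constant unless x lies outside A ∪ B (probability (n-a-b)/n).
HS_given_x = (n - a - b) / n * h(p)

# The gap matches h(p) * (a + b) / n, which is at most (a + b) / n.
gap = HS - HS_given_x
assert abs(pS1 - p) < 1e-12
assert abs(gap - h(p) * (a + b) / n) < 1e-12
assert gap <= (a + b) / n
```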
We can also compute $H(\mathbf x) = \log_2 n$, since $\mathbf x$ is uniform with $n$ outcomes. So we have $$ \log_2 n \le \sum_{i=1}^m \left(\frac{|A_i| + |B_i|}{n}\right) \le \lambda m $$ which means $m \ge \frac{\log_2 n}{\lambda}$, as desired.