Reconstructing sets of sets from reduced membership information

65 Views Asked by At

Let $[n] = \{1,2,\dots,n\}$ and $\mathcal{P}(n)$ be the power set of $[n]$. Let $\sigma \subseteq \mathcal{P}(n)$ be a set of subsets of $[n]$. For each $\sigma$ one can define the square matrix $\{a_{ik}\}_{i,k =1,\dots,n}$ with

$$a_{ik} = \Big|\big\{s \in \sigma\ \big|\ i \in s \wedge |s| = k \big\}\Big|,$$

i.e. the number of $k$-element sets in $\sigma$ that contain $i$. For fixed $i$, the vector $[a_{i1},\dots,a_{in}]$ gives something like an inverted degree sequence: the element (or node) $i$ is contained in $a_{i2}$ pairs, in $a_{i3}$ triples, in $a_{i4}$ quadruples, and so on. It's obvious that $a_{ik} \leq \binom{n-1}{k-1}$. And I'm quite – but not absolutely – sure that $\sum_i a_{ik} = 0 \text{ mod } k$ (a generalization of the handshaking lemma). For lack of another name , let me call this matrix the stats matrix of $\sigma$. [Side question: Does the so defined matrix have an official name?]

Now let a square matrix $\{a_{ik}\}_{i,k =1,\dots,n}$ with $a_{ik} \leq \binom{n-1}{k-1}$ and $\sum_i a_{ik} = 0 \text{ mod } n$ for each $k$ be given. My question is fourfold:

  1. How can it be checked if there is a $\sigma \subseteq \mathcal{P}(n)$ with stats matrix $\{a_{ik}\}$? What are necessary and/or sufficient conditions?

  2. How can the number of $\sigma$ with stats matrix $\{a_{ik}\}$ be determined? Or maybe the fraction of stats matrices (among all as defined in the premise) in the limit $n \rightarrow \infty$?

  3. How can one (random) $\sigma$ with stats matrix $\{a_{ik}\}$ effectively be constructed?

  4. How can all $\sigma$ with stats matrix $\{a_{ik}\}$ effectively be constructed (to perform some statistics on this set)?

The column $k=2$ reminds me of the configuration model for random graphs, maybe this can help to answer question 3 for general $k$ (which to be honest interests me most).

1

There are 1 best solutions below

1
On

I'm just going to focus on the first question here, since there's nothing to be done if there's no solution to it.

Your problem is really $n$ independent problems: for each $k$, the problem of using $a_{1k}, a_{2k}, \dots, a_{nk}$ to reconstruct all the $k$-element sets in $\sigma$ is solved without needing to look at any other column of the stats matrix. There are several special cases where it's easy to solve:

  1. When $k=1$, $a_{i1}$ just tells us directly whether $\{i\} \in \sigma$, so the problem is trivial.
  2. When $k=2$, we want to know if there is a simple graph with the degree sequence $a_{12}, a_{22}, \dots, a_{n2}$, by thinking of a set $\{i,j\} \in \sigma$ as an edge from $i$ to $j$. This can be done with the Havel–Hakimi algorithm.
  3. The cases $k=n-2$ and $k=n-1$ can be reduced to the cases above by taking complements. The case $k=n$ is also not very interesting: $a_{1n}=\dots=a_{nn}=1$ if $[n] \in \sigma$, and $0$ otherwise.

For $2<k<n-2$, this is a well-studied problem we don't know the answer to; it's usually posed in the language of hypergraphs. A $k$-uniform hypergraph with vertex set $[n]$ is just a family of $k$-subsets of $[n]$ (the hyperedges); the degree of a vertex $i$ is the number of hyperedges containing $i$, which is exactly what $a_{ik}$ counts in this question. Given the sequence $(a_{1k}, a_{2k}, \dots, a_{nk})$, you are asking if there is a $k$-uniform hypergraph with this degree sequence; a sequence for which the answer is yes is called "$k$-graphic".

Michael Ferrara's survey Some problems on graphic sequences mentions this problem in section 4, but as of 2013 when it was written no efficient algorithm to solve it is known. Based on the papers citing this survey, it hasn't been solved since then, either.