This is a curiosity question.
Recently I stumbled across the following problem :
Given three integers $k,m, n$ such that $m+k\leq n$. A friend chooses a subset $S\subseteq\lbrace1,\ldots,N\rbrace$ with $k$ elements, and you have to guess what it is. You can ask him specific questions of the form: for each question you choose a subset $G \subseteq\lbrace1,\ldots,N\rbrace$ with $m$ elements and ask him does it have elements in common with $G$?, and you get an answer "Yes" or "No". How many questions do you need to find the subset?
Attempt
I was working on this question for some time without any breakthrough, let $f(n,m,k)$ be the minimal number of questions needed. I was particularly interested in $f(8,4,4)$. I managed to find a formula for very specific cases for example :
- Obviously $f(n,m,0)=0$
- $f(n,n-1,1)=n-1$ and $f(n,1,k)=n-1$
- $f(n,n-2,1)=f(n,2,1)=\lfloor \frac n 2 \rfloor+1$
- Some complicated formulas for $k=2$ but I am not sure if they are correct nothing for $k\geq 3$.
- It seems that $f(n,m,k)=f(n,n-m,k)$ but I could not prove it.
I added the condition $m+k\leq n$ because sometimes, it's not possible to find the subset (I think it's sufficient to ensure the existence of a solution, but I am not sure if it's necessary ).
Question : Is there an algorithm to solve the problem ? to compute $f(n,m,k)$ ? or just any formulas for $k=3,4$?
We can get asymptotic estimates. For any $m$ and $k$, if $n$ is sufficiently large (say $n > km$), we have $$ \left\lfloor\frac{n-k}m\right\rfloor \le f(n,m,k) \le \left\lfloor \frac nm \right\rfloor + f(km,m,k) $$ so in particular $f(n,m,k) = \frac nm + O(1)$ as $n\to \infty$.
The lower bound is just due to the fact that even if you guess disjoint sets each time, the first $\lfloor\frac{n-k}m\rfloor-1$ answers could be "no" and yet not narrow things down to a single option.
For the upper bound, begin by guessing disjoint intervals of length $m$ until either $k$ answers are "yes", or else $k-1$ answers are "yes" and there's at most $m$ elements remaining. (This takes at most $\left\lfloor \frac nm \right\rfloor$ steps.) Then the union of the $k$ intervals with answers of "yes" is a set of size $km$ that contains all elements of $S$, so we can use the $f(km,m,k)$ strategy on it, which takes a number of guesses independent of $n$. (We might be able to do better here, since we can take advantage of lots of elements we know aren't in $S$, but this is just an upper bound.)