I am interested in the mathematics behind a simplified model of test pooling.
Suppose we have a large population of people, each of whom independently has a certain disease with known probability $p\in (0,1)$. We have a perfectly accurate test for the disease, and samples can be pooled: given any group of people, we can perform a single test to determine whether or not at least one member of the group has the disease. We wish to determine the precise set of infected individuals, using a testing strategy that minimizes the expected number of tests.
Let $T(p)$ be the expected number of tests per person using the optimal testing strategy, in the limit as the population goes to infinity.
What can we say about the function $T(p)$? Is there a formula for $T(p)$?
Some bounds:
$T(p)\leq 1$, since we could test everyone individually,
$T(p)\geq p$, since each infected person needs to be tested in at least one pool containing no other infected person,
$T(p)\geq -p\log_2(p)-(1-p)\log_2(1-p)$, since each test provides at most $1$ bit of information.
Suppose we pool $n$ samples for a test. The probability the pool tests positive is $1-(1-p)^n$, and each member of the positive pool has the disease with probability $p/(1-(1-p)^n)$. It follows that $$ T(p) \leq \inf_{n\geq 2}\left[\frac{1}{n}+(1-(1-p)^n)T\left(\frac{p}{1-(1-p)^n}\right)\right]. $$
I have not been able to find any literature on this problem, but I would not be surprised if there is some, so I included the [reference-request] tag.