Optimal way to pool tests

103 Views Asked by At

I am interested in the mathematics behind a simplified model of test pooling.

Suppose we have a large population of people, each of whom independently has a certain disease with known probability $p\in (0,1)$. We have a perfectly accurate test for the disease, and samples can be pooled: given any group of people, we can perform a single test to determine whether or not at least one member of the group has the disease. We wish to determine the precise set of infected individuals, using a testing strategy that minimizes the expected number of tests.

Let $T(p)$ be the expected number of tests per person using the optimal testing strategy, in the limit as the population goes to infinity.

What can we say about the function $T(p)$? Is there a formula for $T(p)$?

Some bounds:

  • $T(p)\leq 1$, since we could test everyone individually,

  • $T(p)\geq p$, since each infected person needs to be tested in at least one pool containing no other infected person,

  • $T(p)\geq -p\log_2(p)-(1-p)\log_2(1-p)$, since each test provides at most $1$ bit of information.

  • Suppose we pool $n$ samples for a test. The probability the pool tests positive is $1-(1-p)^n$, and each member of the positive pool has the disease with probability $p/(1-(1-p)^n)$. It follows that $$ T(p) \leq \inf_{n\geq 2}\left[\frac{1}{n}+(1-(1-p)^n)T\left(\frac{p}{1-(1-p)^n}\right)\right]. $$

I have not been able to find any literature on this problem, but I would not be surprised if there is some, so I included the [reference-request] tag.