Group Testing Problems Where the Number of Defectives is Given By a Probability Distribution

105 Views Asked by At

I was looking at Wolves and Sheep and had an interesting thought.

The original problem says that out of $100$ sheep, it is known that there are $5$ wolves and they want to minimize the number of tests used. Let's say that instead of giving a discrete number such as $5$ wolves, they said that each sheep has a $10 \%$ chance of secretly being a wolf.

In that case, how would we do the testing? All of the solutions on Wolves and Sheep rely on the fact that there are exactly $5$ wolves.

I wish I had some sort of insight or steps that I had taken thus far to approach this problem, but I simply haven't been able to do anything. I don't get how to solve this whatsoever.

1

There are 1 best solutions below

3
On

You would need a multistage testing and scoring mechanism and only have a probabilistic answer at the end so you only know (say with probability 0.99) you have found all the defectives.

Assuming noiseless group tests, if $T(A)$ for a subset $A$ returns “defect present” you use the conditional probability distribution that the number of defectives in $A$ is at least 1, and score each element equally in terms of likelihood of being defective.

Then apply $T(A’)$ to another subset.

If both $T(A)$ and $T(A’)$ say “defective present” any element belonging to $A \cap A’$ gets a higher score.

You can determine based on $p=0.1$ that the most informative test is to test an untested subset of size $k$ where the probability of “defect absent” is roughly $1/2$. This means $(1-p)^k=1/2$ and you take the nearest integer to determine $k.$ This should in general minimize the total number of tests.

In this vein, Chan et al. (2011) introduced COMP, a probabilistic algorithm that requires no more than $t=ed(1+\delta )\ln(n)$ tests to find up to $d$ defectives in $n$ items with a probability of error no more than $n^{-\delta }$. This is within a constant factor of the $t=O(d\log _{2}n)$ lower bound.

See wikipedia for more, the above quote is from there in the “non adaptive and probabilistic testing” section.