Finding the population size from the common elements of multiple samples


A coworker of mine has taken a certification exam twice and has seen the same ten questions on both exams. I am curious whether it is possible to determine the total number of questions in the question pool from the number of questions per exam and the number of questions common to both exams.

The exam has fifty-one questions. Across the two sittings, for a total of one hundred and two questions, ten of the questions appeared on both exams. I am also curious whether this is even enough information to determine the answer.

Mathematics has never been my strong point.

Thank you!

Best answer

On the face of it, this is a capture-recapture question.

You could argue that 10/51 of the second set of questions had been in the first set. Assuming each exam is a random selection of questions from a fixed pool, a reasonable central estimate is that the same fraction of the whole pool was in the first set, i.e. $\hat{n}_\text{pop}\times \frac{10}{51} = 51$, so $\hat{n}_\text{pop} = \frac{51 \times 51}{10} = 260.1$. The true pool size will obviously not be exactly that (it is not an integer), but the estimate is broadly indicative of the actual figure if the assumptions are correct.
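The arithmetic above can be sketched in a few lines of Python. This is only an illustration under the same assumptions as the answer (both exams drawn at random from one fixed pool); the function names `estimate_pool_size` and `expected_overlap` are made up for this example and do not come from any library.

```python
def estimate_pool_size(first_sample: int, second_sample: int, overlap: int) -> float:
    """Simple capture-recapture estimate: solve n_pop * (overlap / second_sample) = first_sample."""
    if overlap <= 0:
        # With no repeated questions the ratio gives no finite estimate.
        raise ValueError("overlap must be positive")
    return first_sample * second_sample / overlap


def expected_overlap(pool_size: float, first_sample: int, second_sample: int) -> float:
    """Average number of repeats if both samples are random draws from a pool of the given size."""
    return first_sample * second_sample / pool_size


# Figures from the question: two exams of 51 questions, 10 in common.
estimate = estimate_pool_size(51, 51, 10)
print(f"Estimated pool size: {estimate:.1f}")
print(f"Expected repeats for that pool size: {expected_overlap(estimate, 51, 51):.1f}")
```

Running the second function with other candidate pool sizes gives a rough feel for how many repeats each would produce on average, which is a quick sanity check on the estimate rather than a confidence interval.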