What size sample do I need to find errors with a certain amount of confidence?


I work in a corporate audit department. Audits are necessarily limited in time and scope; we cannot investigate everything all the time. My question is how many records I should examine to get a certain level of comfort, recognizing that I will sometimes miss errors or fraud because I cannot examine everything.

Suppose I have 1,000,000 transactions. Some of them may have errors/fraud, and I do not know how often errors/fraud occur. I want to test enough transactions so that I am 95% sure, or 99% sure, that I am finding at least one erroneous/fraudulent transaction if the incidence of error/fraud is greater than or equal to 1%, or 2%, or N%.

An example. Suppose in those 1,000,000 transactions there are 10,000 erroneous/fraudulent transactions and I care about an incidence rate of 1% or more. 10,000 is exactly 1% of 1,000,000, so how many transactions would I need to test to be 95% confident that I am testing at least one of those 10,000 erroneous/fraudulent transactions? To be 99% confident? And the same questions if the incidence rate I care about is 2%. (I'm looking for a general formula.)

Best answer

I will leave it to you to find a general formula. Here is the solution for one of the specific scenarios you mention in your question.

If the fraud rate is $p = 0.01$, then you need a sample size $n$ large enough that the probability of drawing at least one fraudulent item exceeds 95%. Let $X$ be the (binomial) count of fraudulent items in a random sample of $n$ transactions. You need $n$ such that $$P(X \ge 1) = 1 - P(X = 0) = 1 - (0.99)^n > 0.95.$$
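The same inequality can be solved for any incidence rate $p$ and confidence level $c$. The symbol $c$ is introduced here only for illustration and does not appear in the question. A sketch of the algebra, assuming transactions are sampled at random and treated as independent draws:

$$1 - (1-p)^n > c \iff (1-p)^n < 1 - c \iff n > \frac{\ln(1-c)}{\ln(1-p)},$$

where the inequality flips in the last step because $\ln(1-p) < 0$. The smallest integer $n$ satisfying it is the required sample size.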

Using logs, one can see that $n = 299$ suffices, as shown below using R as a calculator.

log(.05)/log(.99)
[1] 298.0729

1 - .99^299
[1] 0.9504637

Or from a 'grid' search in R:

n = 1:1000
min(n[dbinom(0, n, .01)<.05])
[1] 299
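The same calculation could be wrapped in a small helper to cover the other scenarios in the question. This is only a sketch: `min_sample_size` is a hypothetical name, and it relies on the binomial model, which treats draws as independent. With 1,000,000 transactions and samples of a few hundred, that should be a close approximation to sampling without replacement.

```r
# Smallest n with 1 - (1 - p)^n > conf, i.e. n > log(1 - conf) / log(1 - p).
# min_sample_size is a hypothetical helper name, not part of base R.
min_sample_size <- function(p, conf) {
  stopifnot(p > 0, p < 1, conf > 0, conf < 1)
  # floor(...) + 1 gives the smallest integer strictly above the bound
  floor(log(1 - conf) / log(1 - p)) + 1
}

# Scenarios from the question: 1% and 2% incidence, 95% and 99% confidence.
scenarios <- expand.grid(p = c(0.01, 0.02), conf = c(0.95, 0.99))
scenarios$n <- mapply(min_sample_size, scenarios$p, scenarios$conf)
print(scenarios)
```

For $p = 0.01$ and 95% confidence this reproduces $n = 299$. The other three scenarios come out to 459 (1%, 99%), 149 (2%, 95%), and 228 (2%, 99%).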