Given a reservoir of size $S$, where each element is either an error or not an error, we attempt to estimate the number of errors in the reservoir as follows.
We poll the reservoir with $P$ samples and verify that each sample is not an error (this is observed ad hoc), and we repeat this polling process for $X$ trials.
The probability of $n$ errors ($n$ set to some percentage of $S$) appearing in a trial is
$$ \approx {n \choose P} (\frac{n}{S})^n (1 - (\frac{n}{S})^{P- n}) $$
If the probability that no errors appear in $X$ trials is low assuming there are $n$ errors in the reservoir, then we can take our observation of no errors as evidence that $n$ is fairly small in the reservoir:
$$ P(\text{no errors despite } n \text{ errors exist}) \approx \left(1 - \sum_{n=1}^{n}{n \choose P}\left(\frac{n}{S}\right)^n \left(1 - \left(\frac{n}{S}\right)^{P-n}\right)\right)^X$$
If $P(\text{no errors despite } n \text{ errors exist}) \ll 1$, then the fact that we observe no errors means that $n$ is small.
But what I found is that
$$P(\text{no errors despite } n \text{ errors exist}) = 1$$
This is counterintuitive: if we set $n = 350$, polling a single element out of a reservoir of size $S = 70000$ with $P = 1$ yields an error with probability $0.005$, so the probability of polling at least one error in $500$ polls must be greater than $0.005$.
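For concreteness, this intuition can be checked with a quick calculation (a sketch assuming the 500 polls are independent draws with replacement):

```python
# Quick check of the intuition above (assumes the 500 polls of size P = 1
# are independent draws with replacement).
S, n, polls = 70000, 350, 500
p = n / S                           # per-poll error probability: 0.005
p_no_error = (1 - p) ** polls       # all 500 polls miss every error
p_at_least_one = 1 - p_no_error
print(round(p_at_least_one, 3))     # about 0.918
```

So under these numbers, seeing no errors at all in 500 polls should itself be quite unlikely (probability roughly $0.08$), which makes the result of $1$ above all the more puzzling.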
Can someone point out where I made a mistake?
If I correctly understand the described procedure, we have a population of size $S$ in which the proportion of errors is $p=n/S$. To estimate $p$ (or $n$), you repeat $M$ times (with replacement between repetitions) a random sampling of $P$ items (drawn without replacement) and count the number of errors in each sample.
Let $\{X_1,\dots,X_M\}$ denote the number of errors in each sample of size $P$. Then, assuming $S$ is large enough relative to $P$,
$$P\{X_i=k\}\approx \binom{P}{k}p^k(1-p)^{P-k}$$
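As a sanity check, this binomial approximation can be compared against the exact hypergeometric probabilities (a sketch using the question's $S$ and $n$, with an assumed sample size of $P = 100$):

```python
from math import comb

# Exact hypergeometric pmf vs. the binomial approximation, for the
# question's population (S = 70000, n = 350) and an assumed P = 100.
S, n, P = 70000, 350, 100
p = n / S

def hypergeom_pmf(k):
    # Exact: probability of k errors among P draws without replacement
    return comb(n, k) * comb(S - n, P - k) / comb(S, P)

def binom_pmf(k):
    # Approximation: valid when S is large relative to P
    return comb(P, k) * p**k * (1 - p) ** (P - k)

for k in range(3):
    print(k, hypergeom_pmf(k), binom_pmf(k))
```

With $S$ this large relative to $P$, the two probabilities agree to about three decimal places.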
and
$$P\{X_1=0,\dots,X_M=0\}\approx (1-p)^{P\times M}$$
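This formula can be verified by simulation; a minimal sketch with small, illustrative numbers (chosen only so the Monte Carlo loop runs quickly), drawing each sample without replacement:

```python
import random

# Monte Carlo check of P{X_1=0,...,X_M=0} ~ (1-p)^(P*M).
# All numbers here are illustrative, not from the question.
random.seed(0)
S, n, P, M = 2000, 10, 5, 40          # population, errors, sample size, samples
population = [1] * n + [0] * (S - n)  # 1 marks an error
p = n / S

def all_samples_clean():
    # M samples of size P, each drawn without replacement
    return all(sum(random.sample(population, P)) == 0 for _ in range(M))

runs = 10000
empirical = sum(all_samples_clean() for _ in range(runs)) / runs
theoretical = (1 - p) ** (P * M)
print(round(empirical, 3), round(theoretical, 3))
```

The empirical frequency of "all $M$ samples clean" should land close to $(1-p)^{P\times M}$.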
You may want to estimate $p$ from the data $\{X_1,\dots,X_M\}$, e.g. using maximum likelihood. In this case (using the binomial approximation again) the log-likelihood is
$$\ln\mathcal{L}(p\mid X_1,\dots,X_M)=\sum_{i=1}^M\ln\binom{P}{X_i}+\ln p\sum_{i=1}^MX_i+\ln(1-p)\sum_{i=1}^M(P-X_i)$$
Maximizing $\ln\mathcal{L}$ over $p$ yields the MLE
$$\hat p=\frac{1}{M}\sum_{i=1}^M\frac{X_i}{P}$$
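A minimal sketch of this estimator on simulated data (all numbers illustrative; `random.sample` draws each sample without replacement):

```python
import random

# MLE sketch: hat_p = (1/M) * sum(X_i / P) is just the overall fraction
# of errors seen across all M*P draws.
random.seed(1)
S, n, P, M = 70000, 350, 200, 100
population = [1] * n + [0] * (S - n)  # 1 marks an error
p = n / S

# X_i = number of errors in the i-th sample of size P
X = [sum(random.sample(population, P)) for _ in range(M)]

p_hat = sum(X) / (M * P)   # maximum-likelihood estimate of p
n_hat = p_hat * S          # implied estimate of the error count n
print(p_hat, n_hat)
```

With $M \times P = 20000$ total draws, $\hat p$ should land close to the true $p = 0.005$, and $\hat n = \hat p \, S$ close to $350$.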
Now you can test (statistically) whether $p$ (or $n$) is close to $0$ or not.