Determine an appropriate size of sample


Let's say I have a pool of $N$ balls, which can be of $n$ colors $A_1, \cdots, A_n$.

$N$ is much bigger than $n$.

How many balls must I draw to get a good estimate of $R_1, \cdots, R_n$, the ratios of each color in the pool?


Some ideas and hints. Let's assume the sampling is without replacement. Then the number of balls of color $A_j$ in the pool is $NR_j$. For a sample of size $M$, let $X=(X_1, \dots, X_n)$ be the numbers of balls drawn of each color. Then the distribution of $X$ is multivariate hypergeometric: $$ P(X_1=x_1, \dots, X_n=x_n) = \frac{\binom{NR_1}{x_1}\dots \binom{NR_n}{x_n}}{\binom{N}{M}} $$ Note that we have $M = x_1 + \dots + x_n$.
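As a small illustration of this setup, here is a minimal Python sketch that draws $M$ balls without replacement and returns the count vector $X$. The pool contents, the color names, and the helper name `draw_counts` are made up for the example; they are not part of the question.

```python
import random
from collections import Counter


def draw_counts(pool, sample_size, seed=None):
    """Draw `sample_size` balls without replacement and count each color.

    `pool` is a list with one entry per ball (its color label).
    Returns a dict mapping every color in the pool to its sample count X_j.
    """
    rng = random.Random(seed)
    sample = rng.sample(pool, sample_size)  # sampling without replacement
    counts = Counter(sample)
    # Include colors that happen not to appear in the sample (count 0).
    return {color: counts[color] for color in sorted(set(pool))}


# Hypothetical pool: N = 10_000 balls, n = 3 colors, ratios 0.5 / 0.3 / 0.2.
pool = ["red"] * 5000 + ["green"] * 3000 + ["blue"] * 2000
M = 400
x = draw_counts(pool, M, seed=0)

# The counts always add up to the sample size: M = x_1 + ... + x_n.
assert sum(x.values()) == M
```

Repeating the draw with different seeds gives different count vectors, whose distribution is the hypergeometric one written above.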

You can estimate the fraction $R_j$ by $\hat{R}_j = \frac{X_j}{M}$ and calculate its variance. I leave that to you, and instead upper-bound the variance by what it would have been if the sampling were with replacement, which is a much simpler expression. That gives $$ \text{Var}(\hat{R}_j) \le \frac{R_j (1-R_j)}{M} \le \frac1{4M} $$ where we have used the inequality $x(1-x) \le \frac14$ for $0 \le x \le 1$. Then you can choose $M$ to meet your requirements on the precision of the estimate. We could do much better if some initial sampling were allowed, to get a preliminary estimate of the fractions $R_j$.
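One possible way to turn the bound $\frac1{4M}$ into a number is sketched below. It assumes, beyond what is stated above, that the estimate is approximately normal, so that a margin of error $e$ at a given confidence level corresponds to $z \sqrt{1/(4M)} \le e$, i.e. $M \ge z^2 / (4e^2)$. The function name `sample_size` and its parameters are hypothetical, and the result is a per-color guarantee, not a simultaneous one for all $n$ colors.

```python
import math
from statistics import NormalDist


def sample_size(margin, confidence=0.95):
    """Worst-case sample size from Var <= 1/(4M).

    Assumption (not from the answer): a normal approximation, so the
    half-width of the confidence interval is z * sqrt(1/(4M)).
    """
    # Two-sided quantile of the standard normal, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Solve z * sqrt(1 / (4 M)) <= margin for M and round up.
    return math.ceil(z ** 2 / (4 * margin ** 2))


print(sample_size(0.05))  # 385
print(sample_size(0.01))  # 9604
```

Note that the result does not depend on $N$ or on $n$: the bound is conservative, and because sampling without replacement only reduces the variance, the true precision is at least as good as requested.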