How many random draws do I need to make out of N numbers to estimate the proportions of population values?

Suppose I have $N$ different numbers. Each of the $N$ numbers can be a $0$, a $2$ or a $4$, and I want to find out the proportion of $0$s, $2$s and $4$s. Suppose that it's unfeasible for me to look at every single one of my $N$ numbers. Instead, I want to rely on "sampling" randomly from them. So, if I draw 10 samples and get four $0$s, four $2$s and two $4$s, my estimated proportions are:

$$ p_0 = 0.4, p_2 = 0.4, p_4 = 0.2 $$

My question is, how many random samples do I need to be reasonably sure that I'm accurately representing the distribution of all $N$ numbers?

EDIT:

I think some may have misunderstood what it is exactly that I'm after. To clarify, what I seek is an estimate of the true distributions of the numbers $0$, $2$ and $4$, and I wonder how many samples I need to draw from the N members of my population to make sure that I'm not too far off.

There is 1 best solution below

Given a sample of size $n$, the expected number of times $0$ appears is $\mu=np_0$, where $p_0$ is the true proportion of $0$s in the population. Treating the draws as independent (so the count is binomial), this count has variance $\sigma^2=np_0(1-p_0)$, or an SD of $\sigma=\sqrt{np_0(1-p_0)}$. The same reasoning applies to $p_2$ and $p_4$.

Therefore, your estimate of $p_0$, which is the count divided by $n$, has an SD of $\frac{\sigma}{n}=\frac{\sqrt{p_0(1-p_0)}}{\sqrt{n}}$.
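As a sanity check on the formula, here is a minimal simulation sketch. It assumes independent draws, with each draw being a $0$ with probability $p_0$. The function names, the number of trials, and the seed are illustrative choices rather than anything taken from the question.

```python
import math
import random
import statistics


def sd_of_estimate(p0, n):
    """Theoretical SD of the estimated proportion: sqrt(p0 * (1 - p0) / n)."""
    return math.sqrt(p0 * (1 - p0) / n)


def simulated_sd(p0, n, trials=2000, seed=0):
    """Empirical SD of the estimated proportion over repeated samples of size n.

    Assumption: each draw is independently a 0 with probability p0.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # Count how many of the n draws came up as the value of interest.
        count = sum(1 for _ in range(n) if rng.random() < p0)
        estimates.append(count / n)
    return statistics.pstdev(estimates)


if __name__ == "__main__":
    p0, n = 0.4, 100
    print("theoretical SD:", sd_of_estimate(p0, n))
    print("simulated SD:  ", simulated_sd(p0, n))
```

With $p_0 = 0.4$ and $n = 100$ the theoretical value is $\sqrt{0.24/100} \approx 0.049$, and the simulated value should land close to it.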

Note that this scales inversely with $\sqrt{n}$, so the larger the sample size, the smaller the uncertainty.
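To turn this into a sample size, one common approach is to require that a multiple $z$ of the SD stays below a chosen margin of error $E$, which gives $n \ge z^2 p(1-p)/E^2$. The sketch below assumes a normal approximation with $z = 1.96$ (roughly 95% confidence) and, since the true proportion is unknown, the worst case $p = 0.5$. These are assumptions added for illustration, and the function name is hypothetical.

```python
import math


def required_sample_size(margin, p=0.5, z=1.96):
    """Smallest n such that z * sqrt(p * (1 - p) / n) <= margin.

    Assumptions: independent draws, normal approximation,
    z = 1.96 for roughly 95% confidence, and p = 0.5 as the
    worst case because p * (1 - p) is largest there.
    """
    if not 0 < margin < 1:
        raise ValueError("margin must be between 0 and 1")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    # Margin of +/- 0.05 on a single proportion, worst case p = 0.5.
    print(required_sample_size(0.05))  # 385
```

Under these assumptions the answer does not depend on $N$, only on the margin you are willing to accept. Halving the margin roughly quadruples the required sample size.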