Sampling without replacement from a population with known fractions of duplicates, triplicates, etc.


Suppose I have a population of known size. For this population, p_1 + p_2 + ... + p_n = 1, where p_1 is the fraction of the population that has no duplicates, p_2 is the fraction of the population with one duplicate, and so on. All of the p_i are known values (and, if it matters for approximate calculations, they decrease very quickly, with anything above p_5 being negligible).

I take a random sample of size X (without replacement) from this population, where X << the total population size, and I want to find the expected fraction of the sample that will not be unique items.

How do I proceed? Please forgive any confusing language above; I am attempting to translate a question into a math problem with no background in statistics.
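One possible way to make the question concrete is sketched below. It rests on two assumptions that are not stated in the question: (1) p_k is read as the fraction of population items that belong to a group of exactly k identical copies, and (2) a sampled item counts as "not unique" when at least one other copy from its group is also in the sample. Under that reading, an item from a group of size k that lands in the sample has no other copy in the sample with probability C(N-k, X-1) / C(N-1, X-1), where N is the population size, and linearity of expectation gives the expected fraction as the sum over k of p_k times one minus that probability. The function name and parameter names are hypothetical.

```python
from math import comb

def expected_nonunique_fraction(p, pop_size, sample_size):
    """Expected fraction of a without-replacement sample that is not unique.

    Assumption (not from the question): p[k-1] is the fraction of population
    items that belong to a group of exactly k identical copies.
    """
    total = 0.0
    for k, p_k in enumerate(p, start=1):
        # Given one item of the group is in the sample, the other
        # sample_size - 1 draws are a uniform subset of the remaining
        # pop_size - 1 items; q_k is the chance none of the other k - 1
        # copies is among them.
        q_k = comb(pop_size - k, sample_size - 1) / comb(pop_size - 1, sample_size - 1)
        total += p_k * (1.0 - q_k)
    return total

# Hypothetical inputs, chosen only for illustration
fraction = expected_nonunique_fraction([0.90, 0.08, 0.02], 100_000, 1_000)
print(fraction)
```

When X is much smaller than N, q_k is close to (1 - X/N)^(k-1), so under the same assumptions the result is roughly (X/N) * (1*p_2 + 2*p_3 + 3*p_4 + ...), which may be enough if only a rough figure is needed.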