Calculating if a sample variation is down to chance?

Thanks for reading.

This is a question that keeps coming up, and I'm hoping some amazing person has an idea of how to do it.

I have a small sample from a known population, and that sample doesn't quite represent the population on some demographic I might choose. Is there a method I can use to calculate the probability that the difference between the two is purely down to chance?

Example 1: A year group at a school has 300 students: 140 boys and 160 girls (the population). They are set into ability groups. The lowest ability group (the sample) has 25 students: 16 boys and 9 girls. Could that be chance?

Example 2: A large company has an ethnically diverse staff. There's also a committee of 20 people, and the ethnic make-up of that committee doesn't quite match that of the staff as a whole. Is it possible to work out the probability that the difference is down to chance?

I've spent ages fishing around for a method, but it always ends up being about the sample size needed for confidence in unknown properties of a population, and it's not that!

Thanks a lot, Dave C

There is 1 solution below

If I understand correctly, it looks like what you want is a goodness-of-fit test.

There are different goodness-of-fit tests, but a simple one to consider for your context is Pearson's chi-squared test. The basic idea is that our null hypothesis says the sample was drawn at random from a population with known frequencies. To test this null hypothesis, we compute a test statistic that measures the gaps between the observed counts in the sample and the counts we would expect from the population frequencies, each gap scaled by the expected count.
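To make the test statistic concrete, here it is written out and applied to Example 1. The symbols $O_i$ (observed count in category $i$), $E_i$ (expected count) and $k$ (number of categories) are my own notation, not something from the question.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}

% Example 1: a group of 25 drawn from 140 boys and 160 girls
E_{\text{boys}}  = 25 \cdot \tfrac{140}{300} \approx 11.67, \qquad
E_{\text{girls}} = 25 \cdot \tfrac{160}{300} \approx 13.33

\chi^2 = \frac{(16 - 11.67)^2}{11.67} + \frac{(9 - 13.33)^2}{13.33} \approx 3.02
```

With two categories the statistic has $k - 1 = 1$ degree of freedom. A common rule of thumb is that the approximation behind the test is reasonable when every expected count is at least about 5, which holds here.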

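This test statistic corresponds to a p-value: the probability, assuming the null hypothesis is true, of getting a test statistic at least as extreme as the one computed from our data. Strictly speaking, that is not the probability that the differences are due to chance. It tells you how surprising a difference this large would be if chance alone were at work. A small p-value (conventionally below 0.05) suggests that chance on its own is an unlikely explanation.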
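As a rough illustration, here is a minimal Python sketch of the test applied to Example 1. The function name `chi_square_two_groups` is made up for this answer, and the sketch assumes exactly two categories, because with 1 degree of freedom the p-value has a simple closed form. For more categories (as in Example 2) you would normally reach for a library routine such as `scipy.stats.chisquare` instead.

```python
import math


def chi_square_two_groups(observed, population):
    """Pearson chi-squared goodness-of-fit test for exactly two categories.

    observed   -- counts in the sample, e.g. [boys, girls]
    population -- counts in the known population, in the same order
    """
    if len(observed) != 2 or len(population) != 2:
        raise ValueError("this sketch only handles two categories")

    n = sum(observed)
    total = sum(population)

    # Counts we would expect if the sample mirrored the population exactly.
    expected = [n * count / total for count in population]

    # Sum of squared gaps, each scaled by its expected count.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper tail of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return expected, statistic, p_value


# Example 1: lowest ability group (16 boys, 9 girls) vs. year group (140, 160).
expected, statistic, p_value = chi_square_two_groups([16, 9], [140, 160])
print(f"expected counts: {expected[0]:.2f} boys, {expected[1]:.2f} girls")
print(f"chi-squared statistic: {statistic:.3f}")
print(f"p-value: {p_value:.3f}")
```

For these numbers the p-value comes out between 0.05 and 0.10, so at the usual 0.05 threshold the data would not be enough to rule out chance, although the imbalance is not trivial either.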