How can I calculate if my data is diverse?

I conducted a survey in which the first question asked students what year they were in. I want to check that I have a diverse set of students (from many year groups) so I can say the survey data represents the population of all students.

The data looks like this:

year 1: 2 people 
year 2: 10 people 
year 3: 8 people 
year 4: 5 people
year 5+: 0 people

There were 25 respondents in total and there were 5 options (of year groups) for them to choose from.

From my data, is it fair to say I have a varied range of respondents across year groups? Is it fair to say the rest of the data reflects all students across year groups?

There is 1 best solution below

Fair warning: I am not a statistician, so please consult one if you really need exact answers to this question. However, I think what follows isn't bad advice.

If you really need to ensure that you have sampled "fairly" from each year for whatever your survey is about, then you should probably go back and re-survey using some kind of stratified sampling technique. Otherwise you are likely to fall into the trap of "forcing" your data to be "fair" by choosing some clever ad-hoc definition of fairness, which might actually invalidate your results.
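To give a rough idea of what stratified sampling could look like in practice, here is a minimal Python sketch. It assumes proportional allocation, which is only one of several possible schemes, and the enrolment figures, the roster, and the function names are all hypothetical placeholders I made up for illustration. You would need the real enrolment numbers for each year group.

```python
import random

def proportional_allocation(population_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size.

    Uses largest-remainder rounding so the allocations sum to total_sample.
    """
    total_pop = sum(population_sizes.values())
    exact = {k: total_sample * n / total_pop for k, n in population_sizes.items()}
    alloc = {k: int(v) for k, v in exact.items()}  # round down first
    leftover = total_sample - sum(alloc.values())
    # Give the remaining slots to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    return alloc

def stratified_sample(students_by_year, allocation, seed=0):
    """Draw a simple random sample of the allocated size within each year."""
    rng = random.Random(seed)
    return {
        year: rng.sample(students_by_year[year], allocation[year])
        for year in students_by_year
    }

# HYPOTHETICAL enrolment per year group (not from the survey).
enrolment = {"year 1": 300, "year 2": 280, "year 3": 250, "year 4": 120, "year 5+": 50}

# HYPOTHETICAL roster: one placeholder ID per enrolled student.
roster = {year: [f"{year} / student {i}" for i in range(n)] for year, n in enrolment.items()}

allocation = proportional_allocation(enrolment, total_sample=25)
sample = stratified_sample(roster, allocation)
print(allocation)
```

With these made-up numbers every year group gets at least one slot, which is the point of stratifying: no year can end up with zero respondents just by chance.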

If you can't afford to re-survey using an appropriate technique, the next best thing is to be honest about your sampling methodology and to frame your conclusions only in terms of the student population as a whole, without regard to year. Don't try to draw any conclusions about individual years, since the survey wasn't set up to support them.
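One way to be honest about the methodology is to report the year breakdown of respondents alongside the results. A small Python sketch using the counts from the question (the variable names are just my own choices):

```python
# Respondent counts per year group, as given in the question.
counts = {"year 1": 2, "year 2": 10, "year 3": 8, "year 4": 5, "year 5+": 0}

total = sum(counts.values())
shares = {year: n / total for year, n in counts.items()}

# Year groups with no respondents at all.
empty = [year for year, n in counts.items() if n == 0]

for year, n in counts.items():
    print(f"{year}: {n} of {total} ({shares[year]:.0%})")

if empty:
    print("No respondents from:", ", ".join(empty))
```

This doesn't test anything statistically. It only makes visible that nobody from year 5+ responded and that year 1 is a small share of the sample, which a reader would probably want to know when judging how far the results generalize.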