I'm struggling with the following problem. I have a table showing the % of a population that like (say) bananas in three locations, the population of each location, and the total population who like bananas in each location (the previous two columns multiplied by each other).
[Column 1] % who like bananas: (1) 13% (2) 11% (3) 17%
[Column 2] Population: (1) 100 (2) 125 (3) 90
[Column 3] Pop who like bananas: (1) 13 (2) 13.75 (3) 15.3
When you sum the third column you get a total of 42.05 who like bananas.
But if you average the % of people shown in the first column you get 13.67%. Multiply this by the sum of the population (315) and you get 43.05, significantly higher than the sum of the third column.
It seems using the average for the three rows skews the final result compared to finding the population who like bananas for each row and then adding these up.
What's a simple way to explain why this is?
Obviously just a theoretical exercise - in real life more people like bananas!
Thanks
The reason is that you can't average percentages when they are referring to different-sized groups.
A much simpler example is this:
Population A is 2 people, and one of them likes bananas. So 50% of Population A likes bananas.
Population B is 8 people, and two of them like bananas. So 25% of Population B likes bananas.
The combined population is 10 people, and three of them like bananas. So 30% of the combined population likes bananas.
However, the average of the two percentages is (50% + 25%) / 2 = 37.5%, which is much bigger.
The issue is this: the two percentages do not represent the same-sized group of people. 50% of Population A (one person) is smaller than 25% of Population B (two people), even though as raw percentages it looks like 50% is bigger than 25%.
A way of dealing with this is to weight each percentage by the size of its group, so that the percentage from the bigger group counts for more: $$ \frac{50\% \times 2 + 25\% \times 8}{2+8} = 30\% $$ This is of course equivalent to just joining the banana-liking groups into one big group and finding the percentage out of the whole population.
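If it helps to see the two calculations side by side, here is a minimal Python sketch applied to the numbers from the question. The function names `simple_average` and `weighted_average` are just illustrative choices, not anything standard.

```python
def simple_average(percentages):
    """Plain mean of the percentages, ignoring group sizes."""
    return sum(percentages) / len(percentages)


def weighted_average(percentages, populations):
    """Mean of the percentages, each weighted by its group's population."""
    weighted_sum = sum(p * n for p, n in zip(percentages, populations))
    return weighted_sum / sum(populations)


# Figures from the question: % who like bananas and population per location.
percentages = [13, 11, 17]
populations = [100, 125, 90]

total = sum(populations)  # 315
simple = simple_average(percentages)
weighted = weighted_average(percentages, populations)

print(round(simple, 2))                   # 13.67
print(round(weighted, 2))                 # 13.35
print(round(simple * total / 100, 2))     # 43.05, the inflated figure
print(round(weighted * total / 100, 2))   # 42.05, matches the column sum
```

The simple average comes out too high here because the smallest location (90 people) has the highest percentage (17%), and the plain mean gives it the same say as the larger locations.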