I've got a big list of data that is split by industry, and 'keyword' within this. I'm trying to do an average per industry, as well as an average for overall, and wanted to check what is the most accurate way of finding averages here. I've checked other similar questions and couldn't quite be sure of the answer.
So I have data split as follows:
So the first Overall Average in the Score column is the average of all the numbers overall, the second is average after doing an average of each keyword, the third is the average of all the scores within an industry, and the final column is per keyword grouping per industry.
Obviously each of these comes up with a different average, but I want to be able to say:
- The average for finance is XX
- The average overall is XX
There will not necessarily be the same number within each industry or keyword.
Could you please let me know which way is the most acccurate to show an overall average, and an average of each industry?
Please let me know if that's not clear. Thank you in advance.
You are right to suspect that you can't simply average averages.
You have computed the average for the industries as
You cannot then average those by adding and dividing by 3, since the industries have different weights. There are $5$ entries for Finance but only $2$ for grocers and $4$ for football. The correct way to average them is $$ \frac{5 \times 5 + 2 \times 3 + 4 \times 4.5} {5 + 2 + 4}. $$
Then you will get the same overall average: $4.7$, so no new information.
You can report the averages per industry separately, you just can't average them.
By the way, if you do any of that you should stick to one digit after the decimal point. All the others are phony precision.