Sum of averages vs average of sums

178.1k Views Asked by At

I have essentially a table of numbers -- a time series of measurements. Each row in the table has 5 values for the 5 different categories, and a sum row for the total of all categories.

If I take the average of each column and sum the averages together, should it equal the average of the rows' sums (ignoring rounding error, of course)?

(I've got a case where the two values keep coming out different by about 30% and I'm wondering just how crazy I am.)

Update: See below -- I was (slightly) crazy and had an error in my code.

4

There are 4 best solutions below

0
On BEST ANSWER

The average of the entries in a column is the sum of the entries in that column, divided by the number of entries. The number of entries is the number of rows. So the sum of the averages is the sum of all the entries in the table, divided by the number of rows.

The average of the row sums is the sum of all entries in the table divided by the number of rows, so you should get the same number either way.

1
On

Sigh!! Found my problem -- it was a stupid "dupe error" in my code. I was looking for an error in the "average of sums" logic, but it was in the "sum of averages" logic -- referencing the wrong variable.

Well, anyway, we've demonstrated about 5 ways from Sunday that the sum of averages really IS equal to the average of sums, in case that's important to anyone in the future.

4
On

If I take the average of each column and sum the averages together, should it equal the average of the rows' sums (ignoring rounding error, of course)?

Generally, no.

1
On

Actually Steve could be correct. I'll give you a simple example and then explain why intelligent people can come up with different answers because in a way, they're both "right."

First row: 5; 6; Second row: 1; 2; Third row: 3; 4;

If you do either the sum of the averages or average of the sums as Daniel asked, then you'll get 7 as the answer. If however, you remove the 1 leaving a hole in your table, then your average of the sums drops to 6 2/3 and your sum of the averages increases to 8.

If your table of data has blanks or missing data points, then the two are almost never the same. If the table of data is equally/evenly distributed without any missing points or holes in the table, then they should always be the same. Anybody can test this out with MS Excel and the =RAND() function. Generate a table with any number of rows/columns and fill in the rows and columns with random numbers or let it generate random numbers for you. Then use =AVERAGE() to average the columns and =SUM() to add up the averages. Then reverse the process and use =SUM() to add the rows and =AVERAGE() to average the sums. If the table is complete, then the two numbers will be precisely the same. If however, your data for any reason is missing entries, then it can vary by a great percentage. Just start deleting data points in the middle of the table and watch the two results greatly fluctuate.

Also of notes is if you flip the rows and columns then you get completely different results, so make sure you're consistent. If you average the rows in the above example and sum the averages, or sum the columns and average the sums, then you get 10.5 with a complete table and 11 and 10, respectively with the 1 missing.