Total average of averages not same as the average of total values


I am having a strange problem while computing the overall percentage. Let me demonstrate my problem using an example.

Assume that I receive multiple batches of apples from a vendor, and the number of good apples in each batch is displayed as a percentage, as below:

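| Batch | Total No of Apples | No of Bad Apples | Good Apple Percentage |
|---|---|---|---|
| 1 | 200 | 3 | 98.5% |
| 2 | 100 | 2 | 98% |
| 3 | 300 | 3 | 99% |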

If I find the average of the last column (Good Apple Percentage),

I get $\frac{98.5+98+99}{3} = 98.5\%$

However, if I first find the total number of apples and the total number of bad apples (columns 2 and 3),

Total No of Apples $= 200+100+300 = 600$

Total No of Bad Apples $= 3+2+3= 8$

and then compute the total percentage, I get $\frac{600-8}{600} \approx 98.67\%$

Why is the average of the averages not the same as the average computed from the total values? No rounding up or down is needed for the individual percentages (column 4), so shouldn't the two percentages be the same? What am I missing?


There are 2 best solutions below

2
On BEST ANSWER

You are getting different numbers because looking at just the averages completely obliterates any sense of the size of the original data. Let me give you a more extreme example to illustrate:

Batch A: 2 apples total, 1 bad apple, 50% good apples

Batch B: 100 apples total, 1 bad apple, 99% good apples

Totaling the two, we have 102 apples, 2 of which are bad and 100 good, so $\frac{100}{102} \approx 98.04\%$ are good. But if we average just the percentages, we get $\frac{50+99}{2} = 74.5\%$.

The moral of the story is, you can't just average the averages because it loses all sense of how big the individual samples are.
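As a rough illustration of the Batch A / Batch B example, here is a minimal Python sketch that computes both quantities. The function names `mean_of_percentages` and `pooled_percentage` and the `(total, bad)` tuple layout are made up for this sketch and are not from the question.

```python
# Each batch is (total_apples, bad_apples); values taken from Batch A and Batch B.
# The tuple layout and function names are illustrative assumptions.
batches = [(2, 1), (100, 1)]


def mean_of_percentages(batches):
    """Unweighted mean of each batch's good-apple percentage."""
    percentages = [100 * (total - bad) / total for total, bad in batches]
    return sum(percentages) / len(percentages)


def pooled_percentage(batches):
    """Good-apple percentage computed from the combined totals."""
    total = sum(t for t, _ in batches)
    bad = sum(b for _, b in batches)
    return 100 * (total - bad) / total


print(round(mean_of_percentages(batches), 2))  # 74.5
print(round(pooled_percentage(batches), 2))    # 98.04
```

The first function treats the 2-apple batch and the 100-apple batch as equally important, which is why its result is dragged down so far.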

0
On

Why is the total average of the average not the same as the average of total values?

You could think of those percentages as the probability that an apple from a given batch is good. There are $3$ times as many apples with a $99\%$ chance of being good as with a $98\%$ chance: $300$ at $99\%$ versus $100$ at $98\%$. When combining them to calculate the percentage for the whole lot, you must take into account the relative size of each batch. This is precisely what the weighted average does: $$\frac{200 \cdot 98.5 + 100 \cdot 98 + 300 \cdot 99}{200+100+300} \approx 98.67$$
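To see why the weighted average reproduces the percentage computed from the totals, here is a short derivation. The symbols are introduced only for this sketch: batch $i$ has $n_i$ apples, of which $g_i$ are good, so its good fraction is $p_i = g_i / n_i$. Then

$$\frac{\sum_i n_i \, p_i}{\sum_i n_i} = \frac{\sum_i n_i \cdot \frac{g_i}{n_i}}{\sum_i n_i} = \frac{\sum_i g_i}{\sum_i n_i}$$

With the numbers from the question this is $\frac{197 + 98 + 297}{600} = \frac{592}{600} \approx 98.67\%$. The plain mean $\frac{1}{k}\sum_i p_i$ over $k$ batches gives the same value when all the $n_i$ are equal, because the weights then cancel; with unequal batch sizes the two generally differ.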