Summing up Standard Deviations - best approximation


I need to compute the overall standard deviation of a long data series. However, due to memory constraints, the data has to be aggregated per day, so that only one number is stored per day (instead of the actual data series).

My current approach is:

  • collect the variance of the data series per day (one number)
  • calculate the average of all daily variances
  • take the square root of this average
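
The steps above can be sketched roughly as follows. This is only an illustration: it assumes the population variance (`statistics.pvariance`) is what gets stored per day, and the function name and sample data are made up for the example.

```python
import math
import statistics


def approx_std_from_daily_variances(days):
    """Square root of the unweighted mean of the per-day variances.

    Assumption: each day stores the population variance of its readings.
    """
    daily_variances = [statistics.pvariance(day) for day in days]
    return math.sqrt(statistics.fmean(daily_variances))


# Hypothetical data: three days with three readings each.
days = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 4.0]]

approx = approx_std_from_daily_variances(days)
# Reference value, computed from the full series that would normally be discarded.
exact = statistics.pstdev([x for day in days for x in day])

print(approx, exact)
```

With equally sized days the approximation can only come out at or below the true value, because the spread of the daily means around the overall mean is not captured by the daily variances.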

If I compare this number to the actual standard deviation over the complete data series across all days, I see that it is a reasonably good approximation (usually less than 1% off).

How do I calculate precisely how far the approximation (i.e. the square root of the average of the aggregated variances) deviates from the actual standard deviation?


There is 1 best solution below

Best answer

If you want to know the deviation precisely, you'll need to calculate both quantities precisely. You lose information about the data (a lot) when you save just the variance.

But you can get pretty close by looking at a pooled variance. The only additional information you'll need is the number of data points in each daily sample. (You might also save the mean of each day's data while you're at it.)
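
Here is a minimal sketch of that idea, assuming each day stores the triple (count, mean, sample variance); the function names and the sample data are hypothetical. `pooled_std` uses only the counts and variances. `combined_std` also uses the daily means, which is enough to recover the overall sample standard deviation exactly: the total sum of squares splits into a within-day part and a between-day part.

```python
import math
import statistics


def pooled_std(stats):
    """Pooled standard deviation from per-day (n, mean, sample variance).

    Ignores differences between the daily means.
    """
    numerator = sum((n - 1) * var for n, _, var in stats)
    denominator = sum(n - 1 for n, _, _ in stats)
    return math.sqrt(numerator / denominator)


def combined_std(stats):
    """Overall sample standard deviation from per-day (n, mean, sample variance)."""
    total_n = sum(n for n, _, _ in stats)
    grand_mean = sum(n * mean for n, mean, _ in stats) / total_n
    # Sum of squares inside each day, recovered from the stored variances.
    within = sum((n - 1) * var for n, _, var in stats)
    # Sum of squares of the daily means around the overall mean.
    between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in stats)
    return math.sqrt((within + between) / (total_n - 1))


# Hypothetical data: three days with three readings each.
days = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 4.0]]
daily_stats = [(len(d), statistics.fmean(d), statistics.variance(d)) for d in days]

pooled = pooled_std(daily_stats)
combined = combined_std(daily_stats)
full_series_std = statistics.stdev([x for d in days for x in d])

print(pooled, combined, full_series_std)
```

The gap between `pooled` and `combined` comes from the between-day term, so it shrinks when the daily means are close to each other.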