Updating mean value and standard deviation


I have a set of data $\{x_1,\ldots, x_N\}$, with large $N$. To compute the mean value I perform the operation:

$$\mu = \frac{1}{N}\sum_{i=1}^Nx_i.$$

The standard deviation is dependent on the value of $\mu$, of course. Suppose that I add $x_{N+1}$ to my data set, so that the new mean value is:

$$\mu' = \frac{1}{N+1}\sum_{i=1}^{N+1}x_i = \frac{1}{N+1}\sum_{i=1}^{N}x_i + \frac{x_{N+1}}{N+1} = \frac{\mu N + x_{N+1}}{N+1}.$$

Instead of re-evaluating the complete sum, I can compute $\mu'$ with only a few operations. This is desirable if I constantly add data to, or remove data from, my data set. For instance, if I remove $x_k$, then:

$$\mu' = \frac{\mu N - x_k}{N-1}.$$
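As a minimal sketch of these two updates in Python (the function names `add_to_mean` and `remove_from_mean` are made up for illustration, and $N$ is assumed to be tracked separately by the caller):

```python
def add_to_mean(mu, n, x_new):
    """Return the mean of n + 1 values, given the mean mu of the first n."""
    return (mu * n + x_new) / (n + 1)


def remove_from_mean(mu, n, x_old):
    """Return the mean of the remaining n - 1 values, given the mean mu of n."""
    if n <= 1:
        # With one value or none, nothing is left to average after removal.
        raise ValueError("need at least two values to remove one")
    return (mu * n - x_old) / (n - 1)
```

Each call costs a constant number of operations, independent of $N$.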

However, updating the standard deviation $\sigma$ does not seem to be so simple. Does anyone know a "quick" formula for updating $\sigma$? Thanks a lot.

Best answer

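Note that

$$\sum \limits_{i = 1}^n (x_i - \overline x)^2 = \sum \limits_{i =1}^n (x_i^2 -2x_i\overline x +\overline x^2) = \sum \limits_{i = 1}^n x_i^2 - 2n\overline x^2 + n\overline x^2 = \sum \limits_{i = 1}^n x_i^2 - n\overline x^2$$

holds, where $\overline x$ denotes the mean of the $x_i$ (the middle step uses $\sum_{i=1}^n x_i = n\overline x$). So you only need to update the mean and the sum of the squares of the $x_i$, each of which takes a few operations when a value is added or removed; the standard deviation then follows from the right-hand side.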
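One way this could look in code is sketched below. The class name `RunningStats` and its methods are hypothetical, and the sketch assumes the population standard deviation (dividing by $n$); for the sample version, the sum of squared deviations would be divided by $n-1$ instead.

```python
import math


class RunningStats:
    """Keeps n, sum(x) and sum(x^2) so mean and std update in O(1)."""

    def __init__(self):
        self.n = 0
        self.total = 0.0     # running sum of the x_i
        self.total_sq = 0.0  # running sum of the x_i^2

    def add(self, x):
        self.n += 1
        self.total += x
        self.total_sq += x * x

    def remove(self, x):
        # Assumption: x is a value that was added earlier; this is not checked.
        if self.n == 0:
            raise ValueError("no data to remove")
        self.n -= 1
        self.total -= x
        self.total_sq -= x * x

    def mean(self):
        if self.n == 0:
            raise ValueError("mean of an empty data set is undefined")
        return self.total / self.n

    def std(self):
        # Population variance: (sum of squares) / n - mean^2.
        m = self.mean()
        var = self.total_sq / self.n - m * m
        # Rounding can make var slightly negative, so clamp it at zero.
        return math.sqrt(max(var, 0.0))


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add(value)
print(stats.mean(), stats.std())  # 5.0 2.0
stats.remove(9)
print(stats.mean(), stats.std())
```

A caveat: when the mean is large compared with the spread of the data, the subtraction in `std` cancels most of the significant digits in floating point. Welford's online algorithm is a commonly used alternative in that situation.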