Understanding weighted standard deviation


Here is the formula that I use for weighted standard deviation: http://www.itl.nist.gov/div898/software/dataplot/refman2/ch2/weightsd.pdf.

My question is: why do we care about the number of nonzero weights? Isn't the value of (N'-1)/N' very close to one?

Also, if I am calculating the weighted absolute standard deviation (like http://en.wikipedia.org/wiki/Absolute_deviation#Mean_absolute_deviation_.28MAD.29, but weighted), do I need to care about the number of nonzero weights?

Best answer

The number of nonzero weights is effectively the sample size. If we have $w_i=0$ then we are ignoring the $i$th observation, so it doesn't really count as part of our sample.
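
For reference, here is the formula from the linked NIST document as I read it, with $N$ the number of observations, $N'$ the number of nonzero weights, and $\bar{x}_w$ the weighted mean. Note that an observation with $w_i=0$ contributes nothing to either sum, so only $N'$ records how many observations actually took part:

$$\bar{x}_w = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}, \qquad s_w = \sqrt{\frac{\sum_{i=1}^{N} w_i\,(x_i-\bar{x}_w)^2}{\dfrac{N'-1}{N'}\,\sum_{i=1}^{N} w_i}}$$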

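You are also correct that $(N'-1)/N'$ will be close to one if $N'$ is large. The reason the factor isn't simply ignored in the first document you cited is that we like our estimators to be unbiased. Strictly speaking, this is a statement about the variance: we want $E[s^2]=\sigma^2$ and $E[s_w^2]=\sigma^2$, where $\sigma^2$ is the true population variance and $E$ denotes the expected value.

With this strange-looking factor, the unweighted $s^2$ is an unbiased estimator of $\sigma^2$, and $s_w^2$ reduces to exactly $s^2$ when all nonzero weights are equal. Without the factor, the equal-weight estimate would be too small on average by a factor of $(N'-1)/N'$. The square roots $s$ and $s_w$ are then the usual estimates of $\sigma$, although taking the square root reintroduces a small bias. Showing the variance result is a tedious exercise, but it is accessible to an undergraduate statistics student.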
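
To make the role of $N'$ concrete, here is a minimal Python sketch of the weighted standard deviation, assuming the formula in the linked NIST document is the one with $(N'-1)/N'$ multiplying the sum of the weights in the denominator. The function name `weighted_sd` and the sample data are my own, purely for illustration.

```python
import math
import statistics


def weighted_sd(xs, ws):
    """Weighted standard deviation with the (N'-1)/N' correction.

    N' is the number of nonzero weights (assumed reading of the NIST formula).
    """
    if len(xs) != len(ws):
        raise ValueError("xs and ws must have the same length")
    n_nonzero = sum(1 for w in ws if w != 0)
    if n_nonzero < 2:
        raise ValueError("need at least two nonzero weights")
    w_sum = sum(ws)
    # Weighted mean: zero-weight observations drop out automatically.
    mean_w = sum(w * x for x, w in zip(xs, ws)) / w_sum
    # Weighted sum of squared deviations from the weighted mean.
    ss = sum(w * (x - mean_w) ** 2 for x, w in zip(xs, ws))
    return math.sqrt(ss / ((n_nonzero - 1) / n_nonzero * w_sum))


data = [2.0, 4.0, 4.0, 5.0, 7.0]

# Equal weights: should agree with the ordinary sample standard deviation.
equal = weighted_sd(data, [1.0] * 5)

# Appending an observation with weight zero should change nothing,
# because N' (not N) is used in the correction factor.
padded = weighted_sd(data + [100.0], [1.0] * 5 + [0.0])

print(equal, statistics.stdev(data), padded)
```

If the correction used $N$ instead of $N'$, the padded call would give a different answer even though the extra observation is ignored everywhere else in the formula.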

As for your second question, if you are using the formula from the first document you linked, then yes, of course: $N'$ appears directly in the formula. As for modifying the MAD formula on Wikipedia to be weighted, I would imagine $N'$ would appear in much the same way, though it is not immediately obvious to me how to modify that formula in an unbiased manner.
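
As a rough illustration only, one plausible way to weight the mean absolute deviation is to take the weighted average of the absolute deviations from the weighted mean. This is my own assumption, not a formula from either linked page, and it carries no small-sample correction, so I make no claim that it is unbiased. The name `weighted_mean_abs_dev` is hypothetical.

```python
import math


def weighted_mean_abs_dev(xs, ws):
    """Weighted average of |x_i - weighted mean| (no bias correction)."""
    if len(xs) != len(ws):
        raise ValueError("xs and ws must have the same length")
    w_sum = sum(ws)
    if w_sum == 0:
        raise ValueError("weights must not sum to zero")
    mean_w = sum(w * x for x, w in zip(xs, ws)) / w_sum
    return sum(w * abs(x - mean_w) for x, w in zip(xs, ws)) / w_sum


data = [2.0, 4.0, 4.0, 5.0, 7.0]

# Plain (unweighted) mean absolute deviation for comparison.
mean = sum(data) / len(data)
plain = sum(abs(x - mean) for x in data) / len(data)

equal = weighted_mean_abs_dev(data, [1.0] * 5)
# A zero-weight observation is ignored without any explicit use of N'.
padded = weighted_mean_abs_dev(data + [100.0], [1.0] * 5 + [0.0])

print(plain, equal, padded)
```

In this uncorrected form, zero-weight observations drop out on their own; $N'$ would only enter if a correction factor analogous to $(N'-1)/N'$ were added.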