Confusion about the effect of averaging on standard deviation

First of all, I was reading a text and got stuck on this part:

[Image: the quoted passage from the text]

Now regarding the above quote, I want to give two scenarios I made up to make my question clear:

Scenario 1: Imagine we take N measurements from the same sensor, and each measurement consists of 10 individual temperature readings (samples). Imagine also that each measurement has a standard deviation estimate of 0.5 degrees Celsius. (So we have 10×N data points in total, from N measurements.)

Scenario 2: Now imagine we take a single measurement from the same sensor, and this single measurement consists of 10×N individual temperature readings (samples). (We again have 10×N data points in total, but from a single measurement.)

My questions are:

1-) Which scenario is the text talking about? What does it mean by "measurements"? (In the text, is a measurement a single data point or an array of data points?)

2-) After averaging, what happens to the standard deviation in both cases (Scenario 1 and Scenario 2 in my examples)?

There is 1 best solution below

In both cases you have $10N$ measurements. But there are several different random variables these could be sampling:

Case 0 - Let's call $T$ the random variable that represents a single measurement. You've told me that $stdev(T) = 0.5$, and you've got $10N$ samples of this random variable. This isn't either of your scenarios.

Case 1 - We can define a random variable $U$ as "take ten measurements and average them together". You might write this as $U=(T_1 + \dots + T_{10})/10$. By grouping and averaging your $10N$ measurements you would obtain $N$ samples of $U$, and the text is telling you that $stdev(U) = 0.5/\sqrt{10}$. This corresponds roughly to your scenario 1, and is the key takeaway from this portion of your text.

Case 2 - You could, if you wanted to, average together all $10N$ measurements. This would give a single sample of a random variable $V = (T_1 + \dots + T_{10N})/(10N)$. This corresponds roughly to your scenario 2 and isn't very realistic, but it would be the case that $stdev(V) = 0.5/\sqrt{10N}$.
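The three cases can be checked numerically. The following is only a sketch under assumptions that are not in the question: the readings are taken to be independent Gaussian noise around a made-up true temperature of 20.0, and `N = 2000` is an arbitrary choice. The names `simulate`, `SIGMA`, `GROUP` and `N` are hypothetical.

```python
import math
import random
import statistics

SIGMA = 0.5   # stdev of a single reading T (value from the question)
GROUP = 10    # readings averaged together to form one sample of U
N = 2000      # number of groups; arbitrary value chosen for illustration


def simulate(seed=0):
    """Return (stdev of T, stdev of U, one sample of V, theoretical stdev of V)."""
    rng = random.Random(seed)
    # Assumption: independent Gaussian readings around a true value of 20.0.
    readings = [rng.gauss(20.0, SIGMA) for _ in range(GROUP * N)]

    # Case 0: the 10*N raw readings are samples of T.
    sd_t = statistics.stdev(readings)

    # Case 1: average each consecutive block of 10 readings -> N samples of U.
    u_samples = [
        statistics.fmean(readings[i:i + GROUP])
        for i in range(0, len(readings), GROUP)
    ]
    sd_u = statistics.stdev(u_samples)

    # Case 2: one grand average is a single sample of V. A stdev cannot be
    # estimated from one sample, so only the theoretical value is reported.
    v_sample = statistics.fmean(readings)
    sd_v_theory = SIGMA / math.sqrt(GROUP * N)

    return sd_t, sd_u, v_sample, sd_v_theory


if __name__ == "__main__":
    sd_t, sd_u, v_sample, sd_v_theory = simulate()
    print("stdev(T) estimate:", sd_t)
    print("stdev(U) estimate:", sd_u, "theory:", SIGMA / math.sqrt(GROUP))
    print("single sample of V:", v_sample, "theoretical stdev(V):", sd_v_theory)
```

Under these assumptions the estimate for $stdev(U)$ should land close to $0.5/\sqrt{10}$, while the estimate for $stdev(T)$ stays near $0.5$. Case 2 yields only one number, which is part of why it is not a very practical way to look at the spread.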

Two things worth mentioning are:

  • The key process taking place is that instead of a single measurement, we take multiple measurements and average them together. These averages have smaller $stdev$s than the single measurements, and $1/\sqrt{\text{number of measurements averaged}}$ is the factor by which the $stdev$ shrinks. Notice that your text describes this poorly, perhaps even flat-out incorrectly, in its second description, when it says "reduce ... by the square root of the number of averages". It is, as the text says earlier, the number of measurements averaged together that determines the scaling factor.

and

  • Statistics notation sucks. When they write $\bar s$ you have no idea which random variable's $stdev$ they are referring to. I tried to make it more explicit with my notation, but most stats texts do this poorly.
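The scaling factor in the first bullet above can be derived in a few lines. This is a sketch that assumes the $n$ measurements $T_1, \dots, T_n$ are independent and all have the same standard deviation $\sigma$ (here $\sigma = 0.5$); the symbols $n$, $\sigma$ and $\bar T$ are introduced only for this illustration.

$$\operatorname{Var}(\bar T) = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} T_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(T_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}, \qquad\text{so}\qquad stdev(\bar T) = \frac{\sigma}{\sqrt{n}}.$$

With $n = 10$ this gives $stdev(U) = 0.5/\sqrt{10} \approx 0.158$, and with $n = 10N$ it gives $stdev(V) = 0.5/\sqrt{10N}$. The second equality is where independence is used; if the measurements are correlated, the variances do not simply add and the reduction is generally different.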