Combining Variance and Sum of Variance Law

Let's say I want to find out how much time my classmates spent preparing for their classes.

I collected data from 10 people, and found out the standard deviation is 0.8 hours with a mean of 2.4 hours.

But, later I felt that the sample size was too small, so I collected data from another 5 people, now with a standard deviation of 1.2 hours and a mean of 2 hours.

So, I want to combine these 2 data sets together.

By the sum of variances law, Var(X ± Y) = Var(X) + Var(Y) for independent X and Y. Hence, the combined standard deviation is s = ((1.2^2) + (0.8^2))^(1/2) = 1.44.

However, when I recomputed the standard deviation from scratch using the sample variance formula, I got a value of s = 0.93.

Why is this so? I am still trying to figure out when it is appropriate to use the sum of variances law.

There is 1 best solution below

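You shouldn't sum $X+Y$. The law $\operatorname{Var}(X\pm Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)$ gives the variance of a new variable built by adding or subtracting two independent variables; it says nothing about the variance of two samples merged into one data set.

Let $\{X_i\}_{i=1}^{15}$ be the 15 observations. You have already computed the following estimators:
$$\bar{X}_1=\frac{1}{10}\sum_{i=1}^{10}X_i,\qquad\bar{X}_2=\frac{1}{5}\sum_{i=11}^{15}X_i$$
$$S_1^2=\frac{1}{9}\sum_{i=1}^{10}\left(X_i-\bar{X}_1\right)^2,\qquad S_2^2=\frac{1}{4}\sum_{i=11}^{15}\left(X_i-\bar{X}_2\right)^2$$
Now you want to compute
$$\bar{X}=\frac{1}{15}\sum_{i=1}^{15}X_i,\qquad S^2=\frac{1}{14}\sum_{i=1}^{15}\left(X_i-\bar{X}\right)^2$$
Your hypothesis $S^2=S_1^2+S_2^2$ is false, but you can recover the combined estimators from the ones you already have:
$$\bar{X}=\frac{1}{15}\left(10\bar{X}_1+5\bar{X}_2\right)$$
$$S^2=\frac{9S_1^2+4S_2^2+10\left(\bar{X}_1-\bar{X}\right)^2+5\left(\bar{X}_2-\bar{X}\right)^2}{14}$$
The last two terms in the numerator account for the difference between the group means; they vanish only when $\bar{X}_1=\bar{X}_2$. With your numbers, $\bar{X}=34/15\approx 2.27$ and $S^2\approx 12.05/14\approx 0.861$, so $S\approx 0.93$, which matches the value you recomputed from scratch.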
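To check the arithmetic numerically, here is a minimal Python sketch of combining two groups from their summary statistics alone. The function name `combine_groups` and the small made-up data set used for the sanity check are illustrative assumptions, not part of the question.

```python
import math
import statistics


def combine_groups(n1, mean1, sd1, n2, mean2, sd2):
    """Return (mean, sample sd) of the merged sample from per-group statistics.

    sd1 and sd2 are assumed to be sample standard deviations (divisor n - 1).
    """
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    # Sum of squared deviations inside each group.
    within = (n1 - 1) * sd1**2 + (n2 - 1) * sd2**2
    # Extra spread caused by the group means differing from the overall mean.
    between = n1 * (mean1 - mean) ** 2 + n2 * (mean2 - mean) ** 2
    return mean, math.sqrt((within + between) / (n - 1))


# Numbers from the question: 10 people (mean 2.4, sd 0.8), 5 people (mean 2.0, sd 1.2).
mean, sd = combine_groups(10, 2.4, 0.8, 5, 2.0, 1.2)
print(round(mean, 2), round(sd, 2))  # 2.27 0.93

# Sanity check on made-up raw data: the combined result should agree with
# computing the standard deviation of the merged list directly.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0]
_, sd_ab = combine_groups(
    len(a), statistics.mean(a), statistics.stdev(a),
    len(b), statistics.mean(b), statistics.stdev(b),
)
print(math.isclose(sd_ab, statistics.stdev(a + b)))  # True
```

Only the group sizes, means, and standard deviations are needed, so the two data sets can be merged without going back to the raw observations.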