Simplifying the Formulas for Weighted Means


I am reading the following Wikipedia article (https://en.wikipedia.org/wiki/Pooled_variance) and came across the following formula:

$$\mu_{x \cup y} = \frac{N_x \mu_x + N_y \mu_y}{N_x + N_y}. \tag{1}$$

If I were asked to calculate the corresponding standard deviation, I would have used first principles (i.e., $\operatorname{Var}(ax + by) = a^2\operatorname{Var}(x) + b^2\operatorname{Var}(y)$ for independent $x$ and $y$) and then taken the square root to arrive at:

$$\sigma_{x \cup y} = \sqrt{\frac{N_x \sigma^2_x + N_y \sigma^2_y}{N_x + N_y}}. \tag{2}$$

However, the Wikipedia article tells me that the correct answer is this:

$$\sigma_{x \cup y} = \sqrt{\frac{N_x \sigma^2_x + N_y \sigma^2_y}{N_x + N_y} + \frac{N_x N_y(\mu_x - \mu_y)^2}{(N_x + N_y)^2}}. \tag{3}$$

I can see that if $\mu_x = \mu_y$ (for example, if both are $0$), the second term under the square root vanishes and $(2) = (3)$.
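A quick numerical check can make the gap between $(2)$ and $(3)$ concrete. The sketch below assumes that $\sigma^2_x$ and $\sigma^2_y$ are population variances (dividing by $N$, not $N - 1$); the data and the helper name `combined_std` are made up for illustration and do not come from the Wikipedia article.

```python
import math
from statistics import fmean, pvariance, pstdev

def combined_std(n_x, mu_x, var_x, n_y, mu_y, var_y):
    """Equation (3): std of the union of two groups, from their summaries."""
    n = n_x + n_y
    within = (n_x * var_x + n_y * var_y) / n          # the part in equation (2)
    between = n_x * n_y * (mu_x - mu_y) ** 2 / n**2   # extra term in equation (3)
    return math.sqrt(within + between)

# Hypothetical data with clearly different group means.
x = [2.0, 4.0, 4.0, 6.0]
y = [10.0, 12.0, 14.0]

# Equation (3), computed from per-group summaries only.
sigma_eq3 = combined_std(len(x), fmean(x), pvariance(x),
                         len(y), fmean(y), pvariance(y))

# Equation (2), which ignores the difference between the group means.
sigma_eq2 = math.sqrt(
    (len(x) * pvariance(x) + len(y) * pvariance(y)) / (len(x) + len(y))
)

# Population standard deviation of the concatenated data, for comparison.
sigma_direct = pstdev(x + y)

print(sigma_eq2, sigma_eq3, sigma_direct)
```

With these numbers, the value from $(3)$ should agree with the directly computed standard deviation of the combined data, while $(2)$ comes out smaller because it ignores the spread between the two group means.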

My Question: Can someone please show me how to derive the formula in $(3)$?

Thanks!
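One way to see where the extra term in $(3)$ comes from is sketched below. It assumes that $\sigma^2_x$ and $\sigma^2_y$ are population variances (dividing by $N$, not $N - 1$). The symbols $x_i$, $y_j$ for the individual observations, and the shorthands $N = N_x + N_y$ and $\mu = \mu_{x \cup y}$, are notation introduced here and do not appear in the post.

$$\begin{aligned}
\sigma^2_{x \cup y} &= \frac{1}{N}\left[\sum_{i=1}^{N_x} (x_i - \mu)^2 + \sum_{j=1}^{N_y} (y_j - \mu)^2\right], \\
\sum_{i=1}^{N_x} (x_i - \mu)^2 &= \sum_{i=1}^{N_x} (x_i - \mu_x)^2 + N_x(\mu_x - \mu)^2 = N_x\sigma^2_x + N_x(\mu_x - \mu)^2, \\
\mu_x - \mu &= \frac{N_y(\mu_x - \mu_y)}{N}, \qquad \mu_y - \mu = -\frac{N_x(\mu_x - \mu_y)}{N}, \\
N_x(\mu_x - \mu)^2 + N_y(\mu_y - \mu)^2 &= \frac{(N_x N_y^2 + N_y N_x^2)(\mu_x - \mu_y)^2}{N^2} = \frac{N_x N_y(\mu_x - \mu_y)^2}{N}, \\
\sigma^2_{x \cup y} &= \frac{N_x\sigma^2_x + N_y\sigma^2_y}{N} + \frac{N_x N_y(\mu_x - \mu_y)^2}{N^2}.
\end{aligned}$$

The second line uses $\sum_i (x_i - \mu_x) = 0$, so the cross term drops out; the same decomposition applies to the $y_j$. Taking the square root of the last line gives $(3)$. The rule $\operatorname{Var}(ax + by) = a^2\operatorname{Var}(x) + b^2\operatorname{Var}(y)$ answers a different question: it gives the variance of a weighted sum of two random variables, not the variance of a data set formed by merging two groups.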

$$\mu_x = \frac{\displaystyle \sum_i N_{x_i} \mu_{x_i}}{\displaystyle \sum_i N_{x_i}}$$

$$\sigma_x = \sqrt{ \frac{ \displaystyle \sum_i (N_{x_i} \sigma^2_{x_i})}{\displaystyle \sum_i N_{x_i}} + \frac{\displaystyle \sum_{i<j} N_{x_i} N_{x_j} (\mu_{x_i} - \mu_{x_j})^2}{\Big(\displaystyle \sum_i N_{x_i}\Big)^2}}$$
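The two expressions above have the same shape as $(1)$ and $(3)$, extended to any number of subgroups. A minimal sketch of how they might be evaluated follows, again assuming population variances; the function name `pooled_mean_std` and the sample data are hypothetical.

```python
import math
from itertools import combinations
from statistics import fmean, pvariance, pstdev

def pooled_mean_std(groups):
    """Combine (size, mean, population variance) triples into overall mean and std."""
    total = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total
    # Size-weighted average of the within-group variances.
    within = sum(n * v for n, _, v in groups) / total
    # Pairwise term over all i < j, as in the second formula above.
    between = sum(
        n_a * n_b * (m_a - m_b) ** 2
        for (n_a, m_a, _), (n_b, m_b, _) in combinations(groups, 2)
    ) / total**2
    return mean, math.sqrt(within + between)

# Hypothetical subgroups, used only to compare against a direct computation.
samples = [[1.0, 2.0, 3.0], [10.0, 11.0], [4.0, 4.0, 5.0, 7.0]]
summaries = [(len(s), fmean(s), pvariance(s)) for s in samples]

mean, std = pooled_mean_std(summaries)
flat = [value for s in samples for value in s]

print(mean, fmean(flat))
print(std, pstdev(flat))
```

Each printed pair should match up to floating-point rounding, since the summaries carry enough information to recover the mean and population standard deviation of the merged data.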