Combining Covariances of Two Sets

87 Views Asked by Bumbble Comm At 06 Apr 2026 - 12:56

According to Wikipedia, the formula for combining the covariances of two sets is:

$$C_X=C_A+C_B+(\overline{x}_A-\overline{x}_B)(\overline{y}_A-\overline{y}_B) \cdot\frac{n_An_B}{n_X} $$

where:

$A$ and $B$ are the first and second sets.
$C$ is the covariance.
$n$ is the number of samples.
$n_X = n_A + n_B$.
$x$ and $y$ are the features.

To test this formula, I implemented it and split one dataset into two equal sets, yet the combined result is quite different from the covariance of the original dataset.

Now, let $M_{AB}$ be this part of the above formula:

$$(\overline{x}_A-\overline{x}_B)(\overline{y}_A-\overline{y}_B) \cdot\frac{n_An_B}{n_X}$$

Looking at this implementation, the author basically applied the following formula:

$$ C_X = \frac{(C_A \color{red}{\cdot n_A}) + (C_B \color{red}{\cdot n_B}) + M_{AB}}{\color{red}{n_X}} $$

which gives the correct combined covariance.
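A minimal sketch of this check, assuming NumPy, synthetic data, and the population covariance (divide by $n$, not $n-1$). The function names are hypothetical and are not taken from the linked implementation; `combine_unscaled` applies the first formula with $C$ read as the covariance, and `combine_scaled` applies the second formula.

```python
import numpy as np

def pop_cov(x, y):
    # Population covariance: divides by n (np.cov divides by n - 1 by default).
    return float(np.mean((x - x.mean()) * (y - y.mean())))

def m_ab(x_a, y_a, x_b, y_b):
    # The mean-difference term M_AB from the question.
    n_a, n_b = len(x_a), len(x_b)
    return (x_a.mean() - x_b.mean()) * (y_a.mean() - y_b.mean()) * n_a * n_b / (n_a + n_b)

def combine_unscaled(x_a, y_a, x_b, y_b):
    # First formula, with C interpreted as the covariance itself.
    return pop_cov(x_a, y_a) + pop_cov(x_b, y_b) + m_ab(x_a, y_a, x_b, y_b)

def combine_scaled(x_a, y_a, x_b, y_b):
    # Second formula: weight each covariance by its sample count, divide by n_X.
    n_a, n_b = len(x_a), len(x_b)
    numerator = pop_cov(x_a, y_a) * n_a + pop_cov(x_b, y_b) * n_b + m_ab(x_a, y_a, x_b, y_b)
    return numerator / (n_a + n_b)

# Synthetic, correlated data (an assumption for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)

# Split the dataset into two equal halves, as in the test described above.
halves = (x[:500], y[:500], x[500:], y[500:])

direct = pop_cov(x, y)
scaled = combine_scaled(*halves)
unscaled = combine_unscaled(*halves)

print("direct  :", direct)
print("scaled  :", scaled)
print("unscaled:", unscaled)
```

With `np.cov`, the matching direct value would be `np.cov(x, y, bias=True)[0, 1]`, since the default there is the $n-1$ normalization.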

I could not understand how the latter is derived algebraically, or whether it is even equivalent to the former, because the second formula contains extra factors $\color{red}{n_A}$ and $\color{red}{n_B}$ and a division by $\color{red}{n_X}$.

Your help is appreciated.
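One reading that would reconcile the two formulas, offered as a sketch: assume that $C$ in the first formula is not the covariance but the co-moment, i.e. the sum of products of deviations $C = \sum_i (x_i-\overline{x})(y_i-\overline{y})$, and write $\operatorname{Cov}$ (a symbol introduced here) for the population covariance, so that $C = n \cdot \operatorname{Cov}$ for each set. Substituting into the first formula gives:

$$ n_X \operatorname{Cov}_X = n_A \operatorname{Cov}_A + n_B \operatorname{Cov}_B + M_{AB} $$

and dividing both sides by $n_X$:

$$ \operatorname{Cov}_X = \frac{(\operatorname{Cov}_A \cdot n_A) + (\operatorname{Cov}_B \cdot n_B) + M_{AB}}{n_X} $$

Under that assumption, the second formula is the first one rewritten in terms of covariances, which would explain the extra $n_A$, $n_B$, and $n_X$. For the sample covariance the same substitution would use $C = (n-1)\cdot\operatorname{Cov}$ instead.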
