Firstly, I apologise if I get this notation wrong. I'm a software engineer much more than I am a mathematician.
I have the following variables, derived from an input set of $x$ and $y$ values. I do not have access to the original set of $x$ and $y$ values from which these were calculated.
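$$n = \operatorname{count}(\operatorname{each}(x))$$
$$\sum_{} x = \operatorname{sum}(x)$$
$$\sum_{} y = \operatorname{sum}(y)$$
$$\sum_{} xy = \operatorname{sum}(xy)$$
$$\bar{x} = \operatorname{mean}(x)$$
$$\bar{y} = \operatorname{mean}(y)$$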
So essentially I know the means, I know the totals and I know the number of inputs, but I don't have the original $x$ and $y$ values (or rather, I'm trying to save CPU cycles by avoiding iterating over them again).
Is it possible to derive the covariance between $x$ and $y$ from these pre-computed values?
I'll assume that the formula for covariance here is:
$$ \frac{ \sum_{} (x-\bar{x})(y-\bar{y}) }{n} $$
I had a go at solving this myself, but my derivation produced the wrong result. Here is my approach.
First I looked to expand the body of the summation as such:
$$ \frac{ \sum_{} (\bar{x}\bar{y} - \bar{x}y - \bar{y}x + xy) }{n} $$
I then split the summation apart along the addition/subtraction lines:
$$ \frac{ \sum_{} \bar{x}\bar{y} - \sum_{} \bar{x}y - \sum_{} \bar{y}x + \sum_{} xy }{n} $$
Since the given $\bar{x}$ and $\bar{y}$ are effectively constants here, I extracted them from the body of the summation:
$$ \frac{ \bar{x}\bar{y} - \bar{x}\sum_{} y - \bar{y}\sum_{} x + \sum_{} xy }{n} $$
At this stage the original covariance formula has been expanded into a form that only uses the variables I already have and does not rely on any individual $x$ or $y$, so I thought this should work. However, when I plug in real values the result does not match the covariance computed directly from the data. Where have I gone wrong?
Many thanks!
OMG I think I just solved this purely through the process of writing it up and reviewing what I've written. Because $\bar{x}\bar{y}$ is a constant, summing it over all $n$ terms gives $n\bar{x}\bar{y}$ rather than $\bar{x}\bar{y}$, so I need to multiply that term by $n$ when I pull it out of the summation. This now seems to work as expected.
$$ \frac{ n\bar{x}\bar{y} - \bar{x}\sum_{} y - \bar{y}\sum_{} x + \sum_{} xy }{n} $$
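For anyone wanting to check this numerically, here is a minimal Python sketch that evaluates the corrected expression from the pre-computed totals and compares it with a direct pass over the data. The function names and the sample values are my own, purely for illustration. Since $\sum x = n\bar{x}$ and $\sum y = n\bar{y}$, the expression should also collapse to $\frac{\sum xy}{n} - \bar{x}\bar{y}$, which the sketch computes as well.

```python
import math


def covariance_from_sums(n, sum_x, sum_y, sum_xy):
    """Population covariance using only the pre-computed aggregates."""
    mean_x = sum_x / n
    mean_y = sum_y / n
    # Corrected expansion: the constant term is summed n times.
    return (n * mean_x * mean_y - mean_x * sum_y - mean_y * sum_x + sum_xy) / n


def covariance_simplified(n, sum_x, sum_y, sum_xy):
    """Same quantity after cancelling terms: sum(xy)/n - mean(x)*mean(y)."""
    return sum_xy / n - (sum_x / n) * (sum_y / n)


def covariance_direct(xs, ys):
    """Reference implementation that iterates over the raw values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Made-up sample data, used only to compare the three computations.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 9.0]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))

from_sums = covariance_from_sums(n, sum_x, sum_y, sum_xy)
simplified = covariance_simplified(n, sum_x, sum_y, sum_xy)
direct = covariance_direct(xs, ys)

print(from_sums, simplified, direct)
```

One caveat as far as I understand it: when the means are large compared with the spread of the data, subtracting two nearly equal floating-point numbers can lose precision, so the direct two-pass version may be more accurate in that situation.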