If $X$ and $Y$ are lists of numbers, why is $\operatorname{Avg}(XY) / \operatorname{Avg}(Y) \neq \operatorname{Avg}(X)$?


Title is pretty descriptive.

Excel sheet representation I made


There are 3 best solutions below


This is because, in general, $Avg(X) \cdot Avg(Y) \ne Avg(X\cdot Y)$.

Computing the averages: $$ \begin{aligned} \left(\frac{x_0 + x_1 + \cdots}{n} \right)\left(\frac{y_0 + y_1 + \cdots}{n}\right) &= \frac{(x_0 + x_1 + \cdots)(y_0 + y_1 + \cdots)}{n^2} \\ \frac{(x_0 + x_1 + \cdots)(y_0 + y_1 + \cdots)}{n^2} &\ne \frac{x_0y_0 + x_1y_1 + \cdots}{n} \end{aligned} $$

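Notice that if you expand $(x_0 + x_1 + \cdots)(y_0 + y_1 + \cdots)$, you get $$x_0y_0 + x_0y_1+ x_0y_2 + \cdots + x_1y_0 + x_1y_1 + x_1y_2 + \cdots + x_2y_0 + x_2y_1+x_2y_2 + \cdots,$$ which contains every cross term $x_iy_j$ with $i \ne j$ in addition to the matching terms $x_iy_i$ that make up $Avg(X\cdot Y)$.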
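A small numeric check makes the gap concrete. The lists below are made up for illustration (they are not the values from the question's spreadsheet), and `mean` is just a helper defined here, using exact fractions to avoid rounding noise.

```python
from fractions import Fraction


def mean(values):
    """Ordinary arithmetic mean, kept exact with Fraction."""
    values = list(values)
    return Fraction(sum(values), len(values))


# Illustrative data only, not taken from the question.
xs = [1, 2, 3]
ys = [4, 5, 9]

avg_x = mean(xs)                              # 2
avg_y = mean(ys)                              # 6
avg_xy = mean(x * y for x, y in zip(xs, ys))  # (4 + 10 + 27) / 3 = 41/3

print(avg_x * avg_y)   # 12
print(avg_xy)          # 41/3
print(avg_xy / avg_y)  # 41/18, which is not Avg(X) = 2
```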


In general, for random variables, $\mathbb E[XY]\ne \mathbb E[X]\mathbb E[Y]$. Equality holds exactly when $X$ and $Y$ are uncorrelated, that is, when $\operatorname{Cov}(X,Y) = \mathbb E[XY] - \mathbb E[X]\mathbb E[Y] = 0$. Independence is a stronger condition that implies this (for jointly Gaussian variables, uncorrelated and independent coincide). So unless your sampled numbers come from uncorrelated distributions, it is not reasonable to assume that the product of the sample means should equal the mean of the products.
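For two finite lists, the same idea can be checked directly: the gap $\operatorname{Avg}(XY) - \operatorname{Avg}(X)\operatorname{Avg}(Y)$ equals the population covariance $\operatorname{Avg}\big((X - \operatorname{Avg}X)(Y - \operatorname{Avg}Y)\big)$. Here is a minimal sketch on made-up data; the helper `mean` and the lists are illustrative assumptions, not part of the question.

```python
from fractions import Fraction


def mean(values):
    """Ordinary arithmetic mean, kept exact with Fraction."""
    values = list(values)
    return Fraction(sum(values), len(values))


# Illustrative data only.
xs = [1, 2, 3, 6]
ys = [2, 0, 5, 1]

avg_x, avg_y = mean(xs), mean(ys)             # 3 and 2
avg_xy = mean(x * y for x, y in zip(xs, ys))  # (2 + 0 + 15 + 6) / 4 = 23/4

# Gap between the mean of products and the product of means.
gap = avg_xy - avg_x * avg_y

# Population covariance: mean of the products of deviations from the means.
cov = mean((x - avg_x) * (y - avg_y) for x, y in zip(xs, ys))

print(gap, cov)  # both are -1/4
```

The two averages agree only when this covariance is zero.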


The first and foremost reason is this:

You had no good reason to expect the two things to be equal.

The second reason is that you tried it and found they are not equal.

The rules for canceling terms in a ratio are very specific. The basic rule is that if you can pull out a single number that is a factor of the top of the ratio and you can pull out the same single number as a factor of the bottom of the ratio, then you can cancel that factor:

$$ \mathbf{If}\quad( A = k\times C\mathbf{\ and\ }B = k\times D)\quad \mathbf{then}\quad \frac AB = \frac{k\times C}{k\times D} = \frac CD. $$

But in $\operatorname{Avg}(XY)$, recall that $$ \operatorname{Avg}(XY) = \frac{x_1 y_1 + x_2 y_2 + \cdots + x_n y_n}{n}.$$

What single number would you pull out as a factor from this? Perhaps the factor $\frac 1n$, but that's not what you tried to do in the question.

Now if it turned out that all the $Y$ values were the same, that is, if $y_1 = y_2 = \cdots = y_n,$ then you could say that every single one of them is just the number $k = \operatorname{Avg}(Y)$ and now you have a single number $k$ that you can pull out like this:

$$ \frac{x_1 \times k + x_2 \times k + \cdots + x_n \times k}{n} = k \times\frac{x_1 + x_2 + \cdots + x_n}{n}. $$

But that doesn't generally work if the $Y$ values are not all the same.
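The constant-$Y$ case can be checked numerically. This is a sketch with made-up numbers; `k`, `xs`, and the helper `mean` are chosen here for illustration only.

```python
from fractions import Fraction


def mean(values):
    """Ordinary arithmetic mean, kept exact with Fraction."""
    values = list(values)
    return Fraction(sum(values), len(values))


# Illustrative data only: every y value is the same number k.
xs = [1, 2, 3]
k = 7
ys = [k] * len(xs)

avg_xy = mean(x * y for x, y in zip(xs, ys))  # (7 + 14 + 21) / 3 = 14
ratio = avg_xy / mean(ys)                     # 14 / 7 = 2

print(ratio == mean(xs))  # True: k factors out, so the ratio is Avg(X)
```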


There is a related topic called a weighted average. The idea of a weighted average of a list of numbers $x_1, x_2, \ldots, x_n$ is that some of the numbers deserve to have more "weight" in the average, sort of like if you had an election and some people had more votes than others did.

The weighted average is done by making a list of "weights," $w_1, w_2, \ldots, w_n$, usually chosen so that $w_1 + w_2 + \cdots + w_n = 1$. Then the weighted average is

$$ w_1 x_1 + w_2 x_2 + \cdots + w_n x_n. $$

If you set $w_1 = w_2 = \cdots = w_n = \frac1n$ then the weighted average turns out to be just the ordinary average, $$ \frac1n x_1 + \frac1n x_2 + \cdots + \frac1n x_n =\frac{x_1 + x_2 + \cdots + x_n}{n}, $$

but in general the whole point of doing a weighted average is that we do not set all the weights equal and we get something where the average ends up getting pulled toward the numbers with the greatest weight.

In an extreme case you can even put all the weight on one number, for example set $w_1 = 1$ and $w_2 = w_3 = \cdots = w_n = 0,$ so that the weighted average of $x_1, x_2, \ldots, x_n$ comes out to just $x_1$ -- that is, the first number gets all the "votes" and it alone decides what the average will be.

In summary, not only is there no reason to think that $\frac{\operatorname{Avg}(XY)}{\operatorname{Avg}(Y)} = \operatorname{Avg}(X),$ there is even a related kind of average called a weighted average in which each $x_i$ is multiplied by a different number from another list, which is used because we want to get an "average" that is different from the ordinary average $\operatorname{Avg}(X).$
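To make the connection explicit: since $\frac{\operatorname{Avg}(XY)}{\operatorname{Avg}(Y)} = \frac{x_1y_1 + \cdots + x_ny_n}{y_1 + \cdots + y_n}$, the ratio is itself a weighted average of the $x_i$ with weights $w_i = y_i / (y_1 + \cdots + y_n)$, provided the $y_i$ do not sum to zero. The sketch below checks this on made-up data; `mean` and `weighted_average` are hypothetical helper names introduced here, and it also confirms that equal weights give back the ordinary average.

```python
from fractions import Fraction


def mean(values):
    """Ordinary arithmetic mean, kept exact with Fraction."""
    values = list(values)
    return Fraction(sum(values), len(values))


def weighted_average(values, weights):
    """Weighted average; the weights are assumed to sum to 1."""
    return sum(w * v for v, w in zip(values, weights))


# Illustrative data only.
xs = [1, 2, 3]
ys = [4, 5, 9]

# Weights proportional to the y values (requires sum(ys) != 0).
total_y = sum(ys)
weights = [Fraction(y, total_y) for y in ys]  # 4/18, 5/18, 9/18

ratio = mean(x * y for x, y in zip(xs, ys)) / mean(ys)
print(ratio)                          # 41/18
print(weighted_average(xs, weights))  # 41/18, the same number

# Equal weights 1/n reproduce the ordinary average.
equal = [Fraction(1, len(xs))] * len(xs)
print(weighted_average(xs, equal) == mean(xs))  # True
```

Because the largest weight sits on $x_3 = 3$, the result $41/18 \approx 2.28$ is pulled above the ordinary average of 2.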