Question about the relation between Expectation and Covariance


I have a question regarding the relationship between the Expectation $E(X)$ and Covariance $Cov(X, Y)$.

For reference, Wolfram MathWorld defines Expectation for a single discrete random variable as: $$E(f(X)) = \sum_{x}f(x)P(x)\qquad\qquad\qquad \cdots\qquad(i)$$ and Covariance of two discrete random variables as: $$Cov(X, Y) = E(XY) - E(X)E(Y)\qquad \cdots\qquad(ii)$$

But it also states the Covariance explicitly as: $$Cov(X, Y) = \sum_{i=1}^N\frac{(x_i - x_a)(y_i - y_a)}{N}\qquad \cdots\qquad(iii)$$

How do you get (iii) from (i) and (ii)? What happened to the P(x) and P(y)?



Best answer

In fact, covariance (like every statistic) has both a population definition and a sample definition. For example, the quantity $EX$ is the population mean of the random variable $X$; its sample version is $\sum_{i=1}^{n}x_{i}/n$, where $x_{1},\dots,x_{n}$ are realizations of $X$. By the same token, the two "definitions" of covariance you saw are, technically speaking, not the same thing. The population covariance $\text{cov}(X,Y)$ is defined as $E[(X - EX)(Y-EY)]$, i.e. the expectation of the product of the deviation of $X$ from $EX$ and the deviation of $Y$ from $EY$. Expanding this product gives the form (ii) quoted from Wolfram. Given realizations $x_{1},\dots,x_{n}$ of $X$ and $y_{1},\dots,y_{n}$ of $Y$, the sample covariance is obtained by replacing the outer "$E$" with the average "$\frac{1}{n}\sum_{i=1}^{n}$" and replacing "$EX$" and "$EY$" with the sample means $x_{a}$ and $y_{a}$ of $X$ and $Y$, respectively, which gives (iii).
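
To spell out the expansion, here is a short sketch. By linearity of expectation, and treating $EX$ and $EY$ as constants, $$E[(X-EX)(Y-EY)] = E(XY) - EX\,EY - EX\,EY + EX\,EY = E(XY) - EX\,EY.$$ As for where the probabilities went: if one assumes (this is my reading, it is not stated in the question) that each of the $n$ observed pairs $(x_i, y_i)$ is given the same probability $1/n$, then the two-variable analogue of (i) applied to $f(x,y) = (x - x_a)(y - y_a)$ reads $$\sum_{i=1}^{n}(x_i - x_a)(y_i - y_a)\cdot\frac{1}{n},$$ so the probabilities have not disappeared; they show up as the factor $1/n$.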

Check some theoretical statistics books (not the applied ones) for a bigger picture of the underlying game. For example, you may want to check Hogg's Mathematical Statistics.

Another answer

Equation (iii) is an empirical estimate of the covariance, computed from a sample of points $x_1, \dots, x_n$ and $y_1, \dots, y_n$ distributed according to $X$ and $Y$ respectively; it is the sample counterpart of the population formula in equation (ii).
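
As a rough numerical illustration, the following Python sketch computes the estimate in the form of (iii) and, separately, in the form of (ii) with every observed pair assigned probability $1/n$. The function names `sample_cov` and `plug_in_cov` and the data are made up for this example, and the equal weighting of the sample points is an assumption of the sketch.

```python
def sample_cov(xs, ys):
    """Form (iii): average product of deviations from the sample means."""
    n = len(xs)
    x_a = sum(xs) / n  # sample mean of the x values
    y_a = sum(ys) / n  # sample mean of the y values
    return sum((x - x_a) * (y - y_a) for x, y in zip(xs, ys)) / n


def plug_in_cov(xs, ys):
    """Form (ii): E(XY) - E(X)E(Y), with probability 1/n on each observed pair."""
    n = len(xs)
    p = 1.0 / n  # assumed equal probability for every sample point
    e_xy = sum(x * y * p for x, y in zip(xs, ys))
    e_x = sum(x * p for x in xs)
    e_y = sum(y * p for y in ys)
    return e_xy - e_x * e_y


# Hypothetical data, chosen only so the arithmetic is easy to follow.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

print(sample_cov(xs, ys))
print(plug_in_cov(xs, ys))
```

Both calls should print the same value up to floating-point rounding, which is the sense in which (iii) is (ii) evaluated on the sample.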

See https://en.wikipedia.org/wiki/Sample_mean_and_covariance