Proving a property about the Covariance of two random variables.


I am struggling with this problem from an introductory Machine Learning course at my university that I never fully wrapped my head around. Here's the problem as it was written:

For any two random variables $X$, $Y$ the covariance is defined as $\text{Cov}(X, Y ) = E\left[\left(X − E[X]\right)\left(Y − E[Y ]\right)\right]$. You may assume $X$ and $Y$ take on discrete values if you find that easier to work with.

a. [$1$ point] If $E[Y | X = x] = x$, show that $$ \text{Cov}(X,Y) = E[(X-E[X])^2] $$

b. [$1$ point] If $X$, $Y$ are independent, show that $$ \text{Cov}(X, Y ) = 0 $$

I am mainly struggling with part a. I think I can see how to get from the proof of part a to the proof of part b.

Any help would be much appreciated. I'll show what I've tried and where I get stuck; at times it seemed like I might have proved that $E[X] = E[Y]$ at least, but these proofs are dubious at best.

Attempt 1:

$E[Y|X=x]=x$ is essentially the same (I think) as $E[Y|X]=X$, since for any value $x$ we are saying that the conditional expected value of $Y$ is $x$.

so: $$ E[Y|X]=X $$ Taking the expected value of both sides: $$ E[E[Y|X]]=E[X] $$ Here I want to say that $E[E[Y|X]]$ is the same as $E[Y]$ because of the way the expected value sums over all the $Y$'s, but I'm pretty sure that's not the right thing to do.
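For what it's worth, the step being doubted here, $E[E[Y|X]] = E[Y]$, can at least be sanity-checked numerically on a small discrete joint pmf. A minimal sketch (the pmf values below are made up purely for illustration):

```python
# Made-up joint pmf p[(x, y)] -- values are purely illustrative.
p = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

# E[Y] computed directly from the joint distribution.
e_y = sum(y * w for (x, y), w in p.items())

# E[E[Y|X]]: compute E[Y|X=x] for each x, then average over P(X=x).
e_e_y_given_x = 0.0
for x0 in {x for (x, _) in p}:
    p_x = sum(w for (x, _), w in p.items() if x == x0)
    e_y_given_x = sum(y * w for (x, y), w in p.items() if x == x0) / p_x
    e_e_y_given_x += p_x * e_y_given_x

# The two quantities agree, as the tower property predicts.
assert abs(e_y - e_e_y_given_x) < 1e-12
```

This is only a check on one example, of course, not a proof.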

Attempt 2:

Let $Z$ be a random variable defined by: $$ Z=Y-E[Y|X] $$ Taking the expected value: $$ E[Z]=E[Y]-E[E[Y|X]] $$ Again, if $E[E[Y|X]] = E[Y]$, I could prove that $E[Z] = 0$ and get around to proving that $E[Y] = E[X]$; this is essentially a roundabout way of doing the above.

I'm pretty stuck for ideas at this point and any help is appreciated!

Best answer:

I hope that this works (I'll assume that $X$ and $Y$ take on discrete values):

a. $\text{If } \mathbb{E}[Y|X=x]=x, \text{ then } \text{cov}(X,Y)= \mathbb{E}[(X-\mathbb{E}(X))^2]$

Solution:

First, let's expand the definition:

$$\begin{aligned}
\text{cov}(X,Y) &= \mathbb{E}[(X-\mathbb{E}[X])(Y-\mathbb{E}[Y])] \\
&= \mathbb{E}[XY - X\mathbb{E}[Y] - \mathbb{E}[X]Y + \mathbb{E}[X]\mathbb{E}[Y]] \\
&= \mathbb{E}[XY] - \mathbb{E}[X\mathbb{E}[Y]] - \mathbb{E}[Y\mathbb{E}[X]] + \mathbb{E}[\mathbb{E}[X]\mathbb{E}[Y]] \\
&= \mathbb{E}[XY] - 2\mathbb{E}[X]\mathbb{E}[Y] + \mathbb{E}[X]\mathbb{E}[Y] \\
&= \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y]
\end{aligned}$$
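This expansion is easy to check numerically. A quick sketch on a made-up discrete pmf (the values are arbitrary), comparing the definition against the shortcut $\mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y]$:

```python
# Made-up joint pmf p[(x, y)] -- arbitrary values summing to 1.
p = {(1, 2): 0.25, (1, 3): 0.25, (2, 2): 0.3, (2, 5): 0.2}

e_x = sum(x * w for (x, y), w in p.items())
e_y = sum(y * w for (x, y), w in p.items())
e_xy = sum(x * y * w for (x, y), w in p.items())

# Covariance from the definition E[(X - E[X])(Y - E[Y])] ...
cov_def = sum((x - e_x) * (y - e_y) * w for (x, y), w in p.items())
# ... and from the expanded form E[XY] - E[X]E[Y].
cov_short = e_xy - e_x * e_y

assert abs(cov_def - cov_short) < 1e-12
```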

Don't forget that $$\mathbb{E}[Y|X=x_k] = x_k, \text{ for all } x_k \in R_X$$

Second, I'll use the hypothesis in the second "=":

$$ \mathbb{E}[X] = \sum_{x_i \in R_X} x_i\,\mathbb{P}[X=x_i] = \sum_{x_i \in R_X} \mathbb{E}[Y|X=x_i]\,\mathbb{P}[X=x_i] = \mathbb{E}[\mathbb{E}[Y|X]] = \mathbb{E}[Y] $$

The last equality is the law of total expectation (the tower property) from probability theory.

Moving on:

$$\begin{aligned}
\mathbb{E}[XY] &= \sum_{x_i \in R_X} \sum_{y_j \in R_Y} x_i y_j \,\mathbb{P}[X=x_i, Y=y_j] \\
&= \sum_{x_i \in R_X} x_i \Big( \sum_{y_j \in R_Y} y_j \,\mathbb{P}[X=x_i, Y=y_j] \Big) \\
&= \sum_{x_i \in R_X} x_i \Big( \sum_{y_j \in R_Y} y_j \,\mathbb{P}[Y=y_j \mid X=x_i]\, \mathbb{P}[X=x_i] \Big) \\
&= \sum_{x_i \in R_X} x_i \,\mathbb{P}[X=x_i] \sum_{y_j \in R_Y} y_j \,\mathbb{P}[Y=y_j \mid X=x_i] \\
&= \sum_{x_i \in R_X} x_i \,\mathbb{P}[X=x_i]\, \mathbb{E}[Y \mid X=x_i] \\
&= \sum_{x_i \in R_X} x_i \,\mathbb{P}[X=x_i]\, x_i \\
&= \sum_{x_i \in R_X} x_i^2 \,\mathbb{P}[X=x_i] = \mathbb{E}[X^2]
\end{aligned}$$

Finally, substituting into the covariance formula:

$$\text{cov}(X,Y) = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y] = \mathbb{E}[X^2] - \mathbb{E}[X]\mathbb{E}[X] = \mathbb{E}[X^2] - \mathbb{E}[X]^2 = \text{Var}[X] = \mathbb{E}[(X-\mathbb{E}[X])^2]$$
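The whole result can be sanity-checked numerically. A sketch with a made-up joint pmf constructed so that $\mathbb{E}[Y|X=x]=x$ (given $X=0$, $Y$ is $\pm 1$ with equal probability, so $\mathbb{E}[Y|X=0]=0$; given $X=1$, $Y$ is $0$ or $2$, so $\mathbb{E}[Y|X=1]=1$):

```python
# Joint pmf built so that E[Y|X=x] = x holds:
#   P(X=0) = 0.4 with Y = -1 or 1 equally likely given X=0,
#   P(X=1) = 0.6 with Y =  0 or 2 equally likely given X=1.
# (Illustrative values only.)
p = {(0, -1): 0.2, (0, 1): 0.2, (1, 0): 0.3, (1, 2): 0.3}

e_x = sum(x * w for (x, y), w in p.items())
e_y = sum(y * w for (x, y), w in p.items())
e_xy = sum(x * y * w for (x, y), w in p.items())

cov = e_xy - e_x * e_y
var_x = sum((x - e_x) ** 2 * w for (x, y), w in p.items())

# cov(X, Y) equals Var[X], matching the result of part a.
assert abs(cov - var_x) < 1e-12
```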

b. If $X$, $Y$ are independent, show that: $$\text{cov}(X,Y)=0$$

Solution:

If $X$ and $Y$ are independent, then $\mathbb{E}[XY]=\mathbb{E}[X]\mathbb{E}[Y]$ (this follows because the joint pmf factors: $\mathbb{P}[X=x_i, Y=y_j] = \mathbb{P}[X=x_i]\,\mathbb{P}[Y=y_j]$, so the double sum for $\mathbb{E}[XY]$ splits into a product of two sums).

So,

$$\text{cov}(X,Y) = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y] = \mathbb{E}[X]\mathbb{E}[Y] - \mathbb{E}[X]\mathbb{E}[Y] = 0$$
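A matching sketch for part b: build the joint pmf of two independent variables as the product of their marginals (arbitrary values), and the covariance comes out zero.

```python
# Marginal pmfs for two independent variables -- arbitrary values.
px = {0: 0.3, 1: 0.7}
py = {2: 0.4, 5: 0.6}

# Independence: the joint pmf is the product of the marginals.
p = {(x, y): wx * wy for x, wx in px.items() for y, wy in py.items()}

e_x = sum(x * w for (x, y), w in p.items())
e_y = sum(y * w for (x, y), w in p.items())
e_xy = sum(x * y * w for (x, y), w in p.items())

# cov(X, Y) = E[XY] - E[X]E[Y] vanishes under independence.
cov = e_xy - e_x * e_y
assert abs(cov) < 1e-12
```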