Covariance/Correlation Proof

I'm stuck on a statistics problem and not sure where to start proving the two statements below. Any help would be greatly appreciated.

Let $x$ and $y$ be jointly distributed numeric variables and let $z = a + by$, where $a$ and $b$ are constants.

Show that $\text{cov}(x, z) = b\, \text{cov}(x, y)$.

Show that if $b > 0$, then $\text{cor}(x, z) = \text{cor}(x, y)$.

There are 2 answers below.

  1. I would start with $$\text{Cov}(X,Z) = E[XZ]-E[X]E[Z],$$ and recall that $Z =a+bY$. Alternatively, $$\text{Cov}(X, Z) = \text{Cov}(X,a+bY)$$ and use bilinearity properties.

  2. I would use $$\text{Corr}(X,Z) = \frac{\text{Cov}(X,Z)}{\text{SD}(X)\text{SD}(Z)}=\frac{\text{Cov}(X,a+bY)}{\text{SD}(X)\text{SD}(a+bY)}.$$
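The two hints above can be carried out explicitly. A sketch of the derivation, using the standard bilinearity and scaling properties of covariance and standard deviation:

```latex
\begin{align*}
\operatorname{cov}(x, z) &= \operatorname{cov}(x,\, a + by) \\
  &= \operatorname{cov}(x, a) + \operatorname{cov}(x, by)
     && \text{(covariance is additive in each argument)} \\
  &= 0 + b\,\operatorname{cov}(x, y)
     && \text{(a constant has zero covariance with anything)} \\
  &= b\,\operatorname{cov}(x, y).
\end{align*}
% For the correlation: sd(a + by) = |b| sd(y), so when b > 0 the b's cancel:
\begin{align*}
\operatorname{cor}(x, z)
  = \frac{\operatorname{cov}(x, z)}{\operatorname{sd}(x)\,\operatorname{sd}(z)}
  = \frac{b\,\operatorname{cov}(x, y)}{\operatorname{sd}(x)\, b\,\operatorname{sd}(y)}
  = \operatorname{cor}(x, y).
\end{align*}
```

Note that if $b < 0$, the same computation gives $\operatorname{cor}(x, z) = -\operatorname{cor}(x, y)$, since $|b| = -b$ there.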


I have a feeling this person is in my class, and I am equally confused. This is all of the information given for the problem. We have not learned about expected values yet. The very small section in our text is extremely vague and does not elaborate on covariance at all.

Below is the entire section from our textbook, and we've learned nothing beyond this. I'm not sure how we can figure this out with only this information. I've omitted the actual formula because it does not translate well to this site. Sorry if there are any weird characters.

If $x$ and $y$ come from samples of size $n$ rather than the whole population, replace the denominator $n$ by $n - 1$ and the population means $\mu(x)$, $\mu(y)$ by the sample means $\bar{x}$, $\bar{y}$ to obtain the sample covariance.
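Since the textbook's exact formula was omitted above, here is a sketch of the standard population and sample covariance definitions, which match the description in this passage (deviation products averaged with denominator $n$ or $n - 1$):

```python
def population_cov(x, y):
    """Average product of deviations from the means, denominator n."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n

def sample_cov(x, y):
    """Same sum of deviation products, but divided by n - 1."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
```

For example, with `x = [1, 2, 3]` and `y = [2, 4, 6]`, the deviation products are $2 + 0 + 2 = 4$, so the population covariance is $4/3$ and the sample covariance is $2$.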

The sign of the covariance reveals something about the relationship between $x$ and $y$. If the covariance is negative, values of $x$ greater than $\mu(x)$ tend to be accompanied by values of $y$ less than $\mu(y)$. Values of $x$ less than $\mu(x)$ tend to go with values of $y$ greater than $\mu(y)$, so $x$ and $y$ tend to deviate from their means in opposite directions.

If $\text{cov}(x, y) > 0$, they tend to deviate in the same direction. The strength of these tendencies is not expressed by the covariance because its magnitude depends on the variability of each of the variables about its mean. To correct this, we divide each deviation in the sum by the standard deviation of the variable. The resulting quantity is called the correlation between $x$ and $y$: $$\text{cor}(x, y) = \frac{\text{cov}(x, y)}{\text{sd}(x)\,\text{sd}(y)}$$
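The two claims in the question can also be checked numerically. A sketch using the sample covariance/correlation described above, with made-up data and the (assumed, illustrative) constants $a = 5$, $b = 3$:

```python
import math

def cov(x, y):
    """Sample covariance, denominator n - 1."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)

def sd(x):
    """Sample standard deviation: sqrt of the variance cov(x, x)."""
    return math.sqrt(cov(x, x))

def cor(x, y):
    """cor(x, y) = cov(x, y) / (sd(x) * sd(y))."""
    return cov(x, y) / (sd(x) * sd(y))

# Check cov(x, z) = b * cov(x, y) and cor(x, z) = cor(x, y) for b > 0,
# where z = a + b*y:
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 1.0, 5.0, 6.0]
a, b = 5.0, 3.0
z = [a + b * yi for yi in y]
```

Shifting by $a$ leaves every deviation from the mean unchanged, and scaling by $b$ scales each deviation by $b$, which is exactly why the two identities hold.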

The correlation between payroll and employees in the example above is 0.9782 (97.82%). Theorem 2.1. The correlation between $x$ and $y$ satisfies $-1 \le \text{cor}(x, y) \le 1$. $\text{cor}(x, y) = 1$ if and only if there are constants $a$ and $b > 0$ such that $y = a + bx$. $\text{cor}(x, y) = -1$ if and only if $y = a + bx$ with $b < 0$.
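The boundary cases of Theorem 2.1 can be spot-checked numerically. A sketch with made-up data, computing the correlation directly from deviation sums:

```python
import math

def cor(x, y):
    """Pearson correlation; the n - 1 denominators cancel in the ratio."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

x = [1.0, 2.0, 3.0, 5.0]
pos = [4.0 + 2.0 * xi for xi in x]   # y = a + bx with b = 2 > 0
neg = [4.0 - 2.0 * xi for xi in x]   # y = a + bx with b = -2 < 0
```

Here `cor(x, pos)` should come out as $1$ and `cor(x, neg)` as $-1$ (up to floating-point error), matching the theorem's two boundary cases.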

A correlation close to $1$ indicates a strong positive relationship (tending to vary in the same direction from their means) between $x$ and $y$, while a correlation close to $-1$ indicates a strong negative relationship. A correlation close to $0$ indicates that there is no linear relationship between $x$ and $y$. In this case, $x$ and $y$ are said to be (nearly) uncorrelated. There might be a relationship between $x$ and $y$, but it would be nonlinear.