Finding covariance using the expectation values


I'm trying to calculate the covariance for an example that I've created - using the covariance formula Cov(X,Y) = E(XY)-E(X)E(Y) as in this question - but I'm running into trouble.

In my example, I roll a 3-sided die 150 times and count how many times each side appears. In R I can simulate 1,000 repetitions of this experiment and show the first 3 results like so:

library(tidyverse)
set.seed(0)
m <- replicate(n = 1000, table(sample(c('X','Y','Z'), size=150, p=c(1/3,1/3,1/3), replace = TRUE))) %>% t()
m %>% head(3)

Which yields:

##  X   Y   Z
##  48  42  60
##  42  59  49
##  54  45  51

I can compute the covariance like so:

cov(m)

Which yields:

##      X           Y           Z
##  X   31.89802    -16.08600   -15.81202
##  Y   -16.08600   31.47373    -15.38773
##  Z   -15.81202   -15.38773   31.19976

Now, I think:

  • E[XY] = 2500
  • E[X] = 50
  • E[Y] = 50

... this gives me:

  • Cov(X,Y) = E(XY)-E(X)E(Y) = 2500-(50*50) = 0

What am I doing wrong / how do I calculate the covariance correctly? It looks to be about -1/2 the variance...

Best answer

A trick for computing $\text{Cov}(X,Y)$ is to note that $X+Y+Z=150$ always.

\begin{align}
0 &= \text{Var}(X+Y+Z) \\
&= \text{Cov}(X+Y+Z, X+Y+Z) \\
&= \text{Var}(X) + \text{Var}(Y) + \text{Var}(Z) + 2\text{Cov}(X,Y) + 2 \text{Cov}(X, Z) + 2 \text{Cov}(Y, Z).
\end{align}

Since $X$, $Y$, and $Z$ are exchangeable, the last line equals $3\text{Var}(X) + 6 \text{Cov}(X,Y)$. Setting this equal to zero yields $\text{Cov}(X,Y) = -\frac{1}{2} \text{Var}(X)$ which matches what you observed in your simulation.
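To put numbers on this, here is a short worked computation. It is an addition to the argument above and assumes the standard fact that each count is marginally Binomial with 150 trials and success probability 1/3, so that its variance is $np(1-p)$:

```latex
\begin{align}
\text{Var}(X) &= 150 \cdot \tfrac{1}{3} \cdot \tfrac{2}{3} = \tfrac{100}{3} \approx 33.33, \\
\text{Cov}(X,Y) &= -\tfrac{1}{2}\,\text{Var}(X) = -\tfrac{50}{3} \approx -16.67.
\end{align}
```

These are close to the simulated values in the question, where the variances are roughly 31 to 32 and the covariances roughly -15 to -16.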

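The error in your theoretical computation is $E[XY]=2500$; I don't know how you arrived at this number. Note that $E[XY] = E[X]E[Y]$ holds only when $X$ and $Y$ are uncorrelated, which they are not here because $X+Y+Z=150$ ties them together.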
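As a rough check, the formula $\text{Cov}(X,Y) = E(XY)-E(X)E(Y)$ can be evaluated directly on simulated data. The sketch below re-creates the question's simulation in base R. The `factor(..., levels = sides)` wrapper and all variable names are my own additions for illustration (the wrapper only guarantees that every table has three entries), and the theoretical values assume the multinomial covariance $-np_ip_j$.

```r
# Re-create the simulation from the question (base R only).
set.seed(0)
sides <- c("X", "Y", "Z")
m <- t(replicate(
  n = 1000,
  table(factor(
    sample(sides, size = 150, prob = rep(1/3, 3), replace = TRUE),
    levels = sides  # assumption: keep all three sides even if one is missing
  ))
))

x <- m[, "X"]
y <- m[, "Y"]

# Empirical counterparts of E[XY], E[X] and E[Y]
e_xy <- mean(x * y)
e_x  <- mean(x)
e_y  <- mean(y)

# Plug-in covariance E[XY] - E[X]E[Y]; this version divides by n
cov_plugin <- e_xy - e_x * e_y

# cov() divides by n - 1, so rescale it before comparing
n <- nrow(m)
cov_sample <- cov(x, y)

print(c(E_XY = e_xy, E_X = e_x, E_Y = e_y))
print(c(plugin = cov_plugin, cov_rescaled = cov_sample * (n - 1) / n))

# Theoretical values for 150 trials with probability 1/3 per side
cov_theory  <- -150 * (1/3) * (1/3)   # -50/3, about -16.67
e_xy_theory <- cov_theory + 50 * 50   # about 2483.33, not 2500
print(c(cov_theory = cov_theory, E_XY_theory = e_xy_theory))
```

The empirical mean of `x * y` should land near 2483 rather than 2500, and the plug-in covariance agrees with `cov()` once the `(n - 1) / n` factor is accounted for.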