Degrees of Freedom in Covariance: Intuition?

If we say $\operatorname{Var}(x)$ has $n-1$ degrees of freedom which are lost after we estimate $\operatorname{Var}(x)$, this matches how $n-1$ observations are now constrained to be sufficiently close to the remaining observation of $x$. In my class, $\operatorname{Cov}(x,y)$ is also described as having $n-1$ degrees of freedom which are lost after we estimate $\operatorname{Cov}(x,y)$. (Reference for this use of "degrees of freedom" in covariance.)

My confusion is that the covariance does not actually seem to constrain $n-1$ $x$ and $y$ values, like the variance did for $x$.

Can you relate the $n-1$ degrees of freedom in covariance to the intuition for degrees of freedom as the number of observations that are not free to change after estimating $\operatorname{Cov}(x,y)$? Is it possible to explain this by counting newly restricted observations, as when explaining why the sample mean has $1$ degree of freedom, or the variance $n-1$ degrees of freedom?

  • The second related question is about the explanation in the reference linked above:

    "Initially, we have $2n$ degrees of freedom in the bivariate data. We lose two by computing the sample means $m(x)$ and $m(y)$. Of the remaining $2n−2$ degrees of freedom, we lose $n−1$ by computing the product deviations. Thus, we are left with $n−1$ degrees of freedom total."

Do you see what the "product deviations" are, and how each one "loses" a degree of freedom?

There is 1 best solution below

Intuitively, one degree of freedom is deducted to correct the bias of the estimator. An unbiased estimator of a (co)variance is one whose expected value equals the population (co)variance, i.e. if you take the sampling distribution of the (co)variance estimates, the average (or "expected value") of that distribution is the (co)variance of the population.
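To make the "average of the sampling distribution" idea concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and not part of the question: the helper names `sample_cov` and `average_estimate`, the sample size $n = 5$, and the toy population in which $y = x + \text{noise}$ so that the population covariance is $1$.

```python
import random


def sample_cov(xs, ys, ddof):
    """Covariance estimate that divides by (n - ddof)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    # Sum of the products of deviations from the two sample means.
    s = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return s / (n - ddof)


def average_estimate(n, trials, ddof, seed=0):
    """Average the estimate over many samples (approximates its expected value)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        # y = x + independent noise, so the population Cov(x, y) = Var(x) = 1.
        ys = [x + rng.gauss(0, 1) for x in xs]
        total += sample_cov(xs, ys, ddof)
    return total / trials


n = 5
biased = average_estimate(n, 20000, ddof=0)    # divide by n
unbiased = average_estimate(n, 20000, ddof=1)  # divide by n - 1
print("divide by n:    ", biased)
print("divide by n - 1:", unbiased)
```

Under these assumptions, the divide-by-$n$ average should come out near $(n-1)/n = 0.8$ times the population covariance, while the divide-by-$(n-1)$ average should sit close to $1$.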

The intuitive explanation for the loss of a degree of freedom is exactly the same for the variance and the covariance: in both cases it concerns the bias of the estimator, which is fixed by subtracting one degree of freedom (dividing by $n-1$ instead of $n$).
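As a rough sketch of why the same correction appears for the covariance, assume the pairs $(x_i, y_i)$ are i.i.d. with means $\mu_x, \mu_y$ and population covariance $\sigma_{xy}$ (this notation is introduced here for illustration). Then $E[x_i y_i] = \sigma_{xy} + \mu_x\mu_y$ and $E[\bar{x}\,\bar{y}] = \sigma_{xy}/n + \mu_x\mu_y$, so

$$
\begin{aligned}
E\left[\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})\right]
&= E\left[\sum_{i=1}^{n} x_i y_i\right] - n\,E[\bar{x}\,\bar{y}] \\
&= n\left(\sigma_{xy} + \mu_x\mu_y\right) - n\left(\frac{\sigma_{xy}}{n} + \mu_x\mu_y\right) \\
&= (n-1)\,\sigma_{xy}.
\end{aligned}
$$

Dividing the sum of products by $n-1$ rather than $n$ therefore gives an estimator whose expected value is $\sigma_{xy}$. Setting $y = x$ recovers the familiar result for the variance.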

I'm not sure whether an informal proof of this fact would be very helpful, but I'm going to cook one up myself (in a later edit of this post) to fine-tune my econometrics knowledge so I can effectively tutor it next semester.

Hope this helps! :)