I've been looking around, but I'm having some difficulty figuring out why I keep seeing multiple formulas for computing the covariance.
From three different sources, the covariance formula appears in the following forms:
Wolfram MathWorld's definition: $$\text{Cov}(X,Y) =\sum_{i=1}^N \frac{(x_i-\bar x)(y_i - \bar y)}{N} \tag{1} $$
"Regression Analysis by Example": $$\text{Cov}(Y,X) =\frac{\sum_{i=1}^N (y_i - \bar y)(x_i-\bar x)}{N-1} \tag{2}$$
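To make the difference between Formulas $(1)$ and $(2)$ concrete, here is a small numeric sketch (the data are made up for illustration); NumPy's `np.cov` exposes both conventions through its `ddof` argument:

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([2.0, 1.0, 5.0, 9.0])
n = len(x)

dx = x - x.mean()
dy = y - y.mean()

cov_pop = (dx * dy).sum() / n         # Formula (1): divide by N
cov_samp = (dx * dy).sum() / (n - 1)  # Formula (2): divide by N - 1

# np.cov returns a 2x2 matrix; the off-diagonal entry is the covariance
assert np.isclose(cov_pop, np.cov(x, y, ddof=0)[0, 1])
assert np.isclose(cov_samp, np.cov(x, y, ddof=1)[0, 1])
```

The two results differ by the constant factor $N/(N-1)$, which shrinks toward $1$ as $N$ grows.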
"Probability and Statistics for Engineers and Scientists": this formula appears to be used for discrete joint probability distributions: $$\text{Cov}(X,Y) =\sum_x \sum_y (x-\mu_x)(y-\mu_y) f(x,y) \tag{3}$$
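For Formula $(3)$, a tiny hand-made joint pmf (the table below is hypothetical) shows how the double sum works out in practice:

```python
import numpy as np

# Hypothetical joint pmf f(x, y) with x in {0, 1} (rows) and y in {0, 1, 2} (columns)
xs = np.array([0.0, 1.0])
ys = np.array([0.0, 1.0, 2.0])
f = np.array([[0.10, 0.20, 0.10],
              [0.15, 0.25, 0.20]])
assert np.isclose(f.sum(), 1.0)  # a valid pmf sums to 1

# Marginal means mu_x = sum_x x * f_X(x) and mu_y = sum_y y * f_Y(y)
mu_x = (xs[:, None] * f).sum()
mu_y = (ys[None, :] * f).sum()

# Formula (3): sum over all (x, y) of (x - mu_x)(y - mu_y) f(x, y)
cov = ((xs[:, None] - mu_x) * (ys[None, :] - mu_y) * f).sum()
```

Here the role that $1/N$ plays in Formula $(1)$ is played by the probabilities $f(x,y)$ themselves.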
A common theme among the three formulas is that each one weights the products of deviations (this applies to Formula $(3)$ as well, since the weights $f(x,y)$ satisfy $0 \le f(x,y) \le 1$ and sum to $1$, playing the role that $1/N$ plays in Formula $(1)$).
However, it seems that the three formulas can end up with different values for $\text{Cov}(X,Y)$, which in turn might give different results if you go on to solve for the correlation coefficient.
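As a quick numeric check of that last point, one can compute the correlation coefficient both ways (a sketch, assuming the variances use the same denominator convention as the covariance):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
y = np.array([2.0, 1.0, 5.0, 9.0])

def corr(x, y, ddof):
    # Correlation = covariance divided by the product of standard deviations,
    # with all three quantities computed using the same ddof convention.
    c = np.cov(x, y, ddof=ddof)
    return c[0, 1] / np.sqrt(c[0, 0] * c[1, 1])

r_pop = corr(x, y, ddof=0)   # all denominators use N
r_samp = corr(x, y, ddof=1)  # all denominators use N - 1

# The N vs. N-1 factor appears in both numerator and denominator,
# so it cancels in the ratio.
assert np.isclose(r_pop, r_samp)
```

So at least in this toy example, the two denominator conventions agree once you form the correlation coefficient, as long as the same convention is used throughout.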
I'm aware that the $N-1$ has to do with bias/unbiasedness, but I thought that if that were the case, Wolfram MathWorld would have made some sort of point about this.
What I am asking is: are these formulas (mainly Formulas $1$ and $2$) close enough that they can be used interchangeably, or should a particular formula be used in particular situations? Is one formula better than another? And does covariance represent the same thing in probability as it does in regression analysis?