I can't understand a term in a dispersion formula


I'm studying an article about similarity between trees of classification procedures, and I cannot understand why the author used a particular term in his formula.

$$\frac{1}{M(M-1)} \sum_{1 \le i < j \le M} d(A_i, A_j)$$

From his words:

An heuristic motivation for this expression comes from its relationship with the alternate expression for the variance of a set of observations $\{X_1, \ldots, X_n\}$ given, for example, in Serfling (1980):

$$ n^{-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n(n-1)} \sum_{1 \le i < j \le n} (X_i - X_j)^2$$
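The quoted relationship can be checked numerically. The following is only a minimal sketch: the helper name `pairwise_dispersion` and the sample data are made up for illustration, and the pairwise term is assumed to be the squared difference $(X_i - X_j)^2$. It compares the pairwise expression with the two usual variance estimates from Python's standard library.

```python
from itertools import combinations
from statistics import pvariance, variance


def pairwise_dispersion(xs):
    """Sum of squared differences over all pairs i < j, divided by n(n-1).

    Hypothetical helper for illustration; it is not taken from the article.
    """
    n = len(xs)
    total = sum((a - b) ** 2 for a, b in combinations(xs, 2))
    return total / (n * (n - 1))


xs = [2.0, 4.0, 4.0, 5.0, 7.0]  # made-up observations
n = len(xs)

# Number of pairs with i < j is n(n-1)/2.
num_pairs = len(list(combinations(xs, 2)))

print("pairs:", num_pairs)
print("pairwise expression:", pairwise_dispersion(xs))
print("variance, divisor n-1:", variance(xs))
print("variance, divisor n:", pvariance(xs))
```

In this sketch the pairwise expression agrees with the variance that uses the divisor $n-1$ rather than $n$, so the $n^{-1}$ on the left of the quoted identity may hold only approximately for large $n$. One way to read the factor, under these assumptions: there are $n(n-1)/2$ pairs with $i < j$, so $\frac{1}{n(n-1)}$ averages over the pairs and applies an extra factor of $\frac{1}{2}$.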

I understand the sum: in the general case it is a sum of all the pairwise distances, and in my case the distance is the dissimilarity (a type of distance).

However, why is the factor $\frac{1}{n(n-1)}$ used? I would like an easy, intuitive explanation.

(I know it's probably one of the easiest things in the world.)