Spearman's Rank: why does it work?


Looking at Spearman's rank correlation, can someone explain how the formula works? Is there anything intuitive about it?


Spearman's rank correlation is calculated by replacing the paired observations ${X_i},\;{Y_i}$ with their ranks ${x_i},\;{y_i}$. Pearson's correlation between the ranks is then given by: $$\rho = \frac{\sum_i(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_i (x_i-\bar{x})^2 \sum_i(y_i-\bar{y})^2}}$$
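As a rough illustration of this definition, the sketch below ranks two small samples and applies Pearson's formula to the ranks. The helper names `ranks` and `pearson` and the sample data are made up for this example, and the ranking helper assumes there are no ties.

```python
from math import sqrt

def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def pearson(x, y):
    """Pearson correlation, written term by term as in the formula above."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    numerator = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    denominator = sqrt(
        sum((a - x_bar) ** 2 for a in x) * sum((b - y_bar) ** 2 for b in y)
    )
    return numerator / denominator

# Hypothetical sample data, chosen only for illustration.
X = [10.2, 3.5, 7.7, 15.0, 1.1]
Y = [2.0, 1.5, 4.0, 3.0, 0.5]

x = ranks(X)         # [4, 2, 3, 5, 1]
y = ranks(Y)         # [3, 2, 5, 4, 1]
rho = pearson(x, y)  # Spearman's coefficient: Pearson applied to the ranks
```

For these made-up samples the rank deviations multiply out to 7 in the numerator and 10 in the denominator, so `rho` comes out at 0.7.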

Assume there are no ties, so the x's and y's are both permutations of the integers from 1 to n. The two sums of squares in the denominator are then equal, and we can rewrite the denominator as $$\sqrt{\sum_i (x_i-\bar{x})^2}\,\sqrt{\sum_i (y_i-\bar{y})^2} = \sqrt{\sum_i (x_i-\bar{x})^2}\,\sqrt{\sum_i (x_i-\bar{x})^2} = \sum_i (x_i-\bar{x})^2$$

Using $\sum_i x_i^2 = n(n+1)(2n+1)/6$ and $\bar{x} = \bar{y} = (n+1)/2$, some algebra yields:

$$\begin{aligned}
\sum_i (x_i-\bar{x})^2 &= \sum_i x_i^2 - n\bar{x}^2 \\
&= \frac{n(n + 1)(2n + 1)}{6} - n\left(\frac{n + 1}{2}\right)^2 \\
&= n(n + 1)\left(\frac{2n + 1}{6} - \frac{n + 1}{4}\right) \\
&= n(n + 1)\left(\frac{8n + 4 - 6n - 6}{24}\right) \\
&= n(n + 1)\left(\frac{n - 1}{12}\right) \\
&= \frac{n(n^2 - 1)}{12}
\end{aligned}$$

Now look at the numerator:

$$\begin{aligned}
\sum_i(x_i-\bar{x})(y_i-\bar{y}) &= \sum_i x_i(y_i-\bar{y})-\sum_i\bar{x}(y_i-\bar{y}) \\
&= \sum_i x_i y_i-\bar{y}\sum_i x_i-\bar{x}\sum_i y_i+n\bar{x}\bar{y} \\
&= \sum_i x_i y_i-n\bar{x}\bar{y} \\
&= \sum_i x_i y_i-n\left(\frac{n+1}{2}\right)^2 \\
&= \sum_i x_i y_i- \frac{n(n+1)}{12}\cdot 3(n+1) \\
&= \frac{n(n+1)}{12}\cdot\bigl(-3(n+1)\bigr)+\sum_i x_i y_i \\
&= \frac{n(n+1)}{12}\cdot\bigl[(n-1) - (4n+2)\bigr] + \sum_i x_i y_i \\
&= \frac{n(n+1)(n-1)}{12} - \frac{n(n+1)(2n+1)}{6} + \sum_i x_i y_i \\
&= \frac{n(n+1)(n-1)}{12} -\sum_i x_i^2+ \sum_i x_i y_i \\
&= \frac{n(n+1)(n-1)}{12} -\frac{1}{2}\sum_i (x_i^2+ y_i^2)+ \sum_i x_i y_i \\
&= \frac{n(n+1)(n-1)}{12} - \frac{1}{2}\sum_i (x_i^2 - 2x_i y_i + y_i^2) \\
&= \frac{n(n+1)(n-1)}{12} - \frac{1}{2}\sum_i(x_i - y_i)^2 \\
&= \frac{n(n^2-1)}{12} - \frac{1}{2}\sum_i d_i^2, \quad \text{with } d_i = x_i - y_i
\end{aligned}$$
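The two closed forms above can be spot-checked numerically. The following is a minimal sketch, assuming tie-free ranks (random permutations of 1 to n) and using exact rational arithmetic so that no rounding is involved; the function name `check_identities` is invented for this example.

```python
from fractions import Fraction
import random

def check_identities(n, seed=0):
    """Check both closed forms for one random pair of tie-free rankings."""
    rng = random.Random(seed)
    x = list(range(1, n + 1))      # ranks 1..n
    y = x[:]
    rng.shuffle(y)                 # a random permutation of the same ranks

    mean = Fraction(n + 1, 2)      # x-bar = y-bar = (n + 1) / 2
    sum_sq = sum((a - mean) ** 2 for a in x)
    cross = sum((a - mean) * (b - mean) for a, b in zip(x, y))
    d_sq = sum((a - b) ** 2 for a, b in zip(x, y))

    closed = Fraction(n * (n * n - 1), 12)   # n(n^2 - 1) / 12
    denominator_ok = sum_sq == closed
    numerator_ok = cross == closed - Fraction(d_sq, 2)
    return denominator_ok, numerator_ok

results = [check_identities(n, seed=n) for n in range(2, 50)]
all_hold = all(den_ok and num_ok for den_ok, num_ok in results)
```

A check like this is not a proof, but it is a quick way to catch an algebra slip in either chain of equalities.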

The required fraction (i.e. Spearman's rank correlation coefficient) is therefore:

$$\begin{aligned}
\rho &= \frac{n(n+1)(n-1)/12 - \sum_i d_i^2/2}{n(n^2 - 1)/12} \\
&= \frac{n(n^2 - 1)/12 -\sum_i d_i^2/2}{n(n^2 - 1)/12} \\
&= 1- \frac{6 \sum_i d_i^2}{n(n^2 - 1)}\, \bullet
\end{aligned}$$

(...hope this makes everyone feel contented).
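To see how the final formula behaves, here is a small sketch that evaluates it on tie-free rank lists. The function name `spearman_shortcut` and the example rankings are assumptions made for illustration only.

```python
from fractions import Fraction

def spearman_shortcut(x, y):
    """1 - 6 * sum(d_i^2) / (n (n^2 - 1)) for tie-free rank lists x and y."""
    n = len(x)
    d_sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - Fraction(6 * d_sq, n * (n * n - 1))

order = [1, 2, 3, 4, 5]
same = spearman_shortcut(order, order)           # identical rankings
reverse = spearman_shortcut(order, order[::-1])  # exactly opposite rankings
mixed = spearman_shortcut([4, 2, 3, 5, 1], [3, 2, 5, 4, 1])
```

Identical rankings give every $d_i = 0$ and a coefficient of 1, exactly reversed rankings give the largest possible $\sum d_i^2$ and a coefficient of -1, and the mixed example gives 7/10. That is perhaps the most direct intuition in the formula: it measures how far the rank differences are from zero, scaled so that the extremes land on 1 and -1.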