Different tables giving different critical values for Spearman


First of all, can you please check my understanding.

With 4 pairs in Spearman, there are 4! = 24 ways of arranging the data. Only one arrangement has a perfect score of 1, and only one has a perfect score of -1. As the coefficient gets closer to 0, there are more arrangements that could have produced it, so it is more likely the score happened by chance. Is this correct?
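(This reasoning can be checked by brute force. A small Python sketch, using the no-ties formula $r_S = 1 - 6\sum d_i^2/(n(n^2-1)),$ enumerates all 24 arrangements:)

```python
from itertools import permutations
from collections import Counter

n = 4
base = tuple(range(1, n + 1))              # ranks of X, held fixed

def spearman(perm):
    """r_S = 1 - 6*sum(d^2)/(n*(n^2-1)), valid when there are no ties."""
    d2 = sum((a - b) ** 2 for a, b in zip(base, perm))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# under independence, every ordering of the Y-ranks is equally likely
counts = Counter(spearman(p) for p in permutations(base))

print(sum(counts.values()))                # 24 arrangements in all
print(counts[1.0], counts[-1.0])           # exactly one +1 and one -1
```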

What do the tables actually show, in layman's terms?

Secondly, when looking at tables such as http://web.anglia.ac.uk/numbers/biostatistics/spearman/local_folder/critical_values.html and https://mrphillipsibgeog.wikispaces.com/file/view/geog_SpearmanExplained.pdf/410697184/geog_SpearmanExplained.pdf we see different critical values. Why is this?

Thank you.

Normal approximations. In an earlier Comment (now deleted) I speculated that one reason for differences among tables might be that some use a normal approximation and some do not. In checking around, it looks as if this may be true in some cases, but it seems that most tables do not use normal approximations until $n$ is near 30, or greater. The Wikipedia article on 'Spearman rank correlation' gives a readable account of normal approximations; some computation based on, but beyond, the formulas there would be required to get a table of critical values for various significance levels.
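To see how much a normal approximation can move a critical value, here is a short Python sketch of one simple version, which treats $z = r_S\sqrt{n-1}$ as standard normal under $H_0$ (the Wikipedia article discusses this and more refined variants). For $n = 10$ it gives about $0.548,$ noticeably different from the exact $0.564$:

```python
from math import sqrt
from statistics import NormalDist

# Simple normal approximation: under H0, z = r_S * sqrt(n - 1)
# is approximately standard normal.
z = NormalDist().inv_cdf(0.95)             # upper 5% point, ~1.645

for n in (10, 20, 30):
    print(n, round(z / sqrt(n - 1), 3))    # approximate one-tailed 5% critical value
```

The gap between approximation and exact value shrinks as $n$ grows, which is consistent with most tables switching to the approximation only near $n = 30.$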

Simulation. I also speculated in my former Comment that some tables might be based on simulation results with too few iterations. Again, this might be true in some cases, but in simulations of my own I found that for small $n$ there may be several legitimate interpretations of simulation results. I discuss this difficulty below.

Permutation distribution of $r_S$ under the null hypothesis of independence. If we knew the distribution of $r_S$ for data $(X_i, Y_i),$ where $X$ and $Y$ are independent, then the critical values of a test of $H_0: \rho_S = 0$ against the two-sided alternative at the 10% level would be found by cutting 5% from each tail of the null distribution. Several tables I looked at have the critical values $\pm 0.564.$

In order to simulate the null distribution when $n = 10,$ we take two samples of size 10 from $Unif(0, 1),$ find their ranks, and then find the Pearson correlation of the two vectors of ranks. If we do this 100,000 times, we can get a good approximation of the null distribution of $r_S$ and then the critical value. (This is the 'permutation procedure' mentioned in the Wikipedia article.) The code for this simulation in R statistical software is shown below.

m = 10^5;  n = 10;  r.s = numeric(m)               # m simulated values of r_S
for (i in 1:m) {
   r.s[i] = cor(rank(runif(n)), rank(runif(n))) }  # Pearson corr. of the ranks
quantile(r.s, c(.05, .95))                         # cutoffs for two-sided 10% test
##         5%        95% 
## -0.5515152  0.5515152 
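For readers without R, here is an equivalent vectorized Python sketch. Since continuous draws have no ties, the Pearson correlation of the ranks reduces to the familiar formula $r_S = 1 - 6\sum d_i^2/(n(n^2-1)):$

```python
import numpy as np

rng = np.random.default_rng(1)             # fixed seed, for reproducibility
m, n = 10**5, 10

def row_ranks(u):
    """Ranks 1..n within each row (no ties for continuous draws)."""
    return u.argsort(axis=1).argsort(axis=1) + 1

rx = row_ranks(rng.random((m, n)))
ry = row_ranks(rng.random((m, n)))

# With no ties, the Pearson correlation of the ranks reduces to
# r_S = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))
d2 = ((rx - ry) ** 2).sum(axis=1)
r_s = 1 - 6 * d2 / (n * (n**2 - 1))

print(np.quantile(r_s, [0.05, 0.95]))      # near the R values of +/- 0.552
print(len(np.unique(r_s)))                 # roughly 162 distinct values
```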

These simulated critical values $\pm 0.552$ are in the same ballpark as the tabled values $\pm 0.564,$ but the tabled values are not within the expected margin of simulation error. What's wrong?

Compensating for discreteness. The difficulty stems from the fact that the ranks are discrete. And this discreteness is inherited by $r_S.$ The R code length(unique(r.s)) returns 162. That is, among the $m = 100,000$ simulated values of $r_S$ there are only 162 distinct values. This makes it difficult to pin down precise critical values.

 q.95 = quantile(r.s, .95); q.95
 ##       95% 
 ## 0.5515152 
 mean(r.s >= q.95);  mean(r.s > q.95)
 ## 0.05337   # P(r.s >= 0.552) = .053
 ## 0.04904   # P(r.s > 0.552) = .049

So if we were to use 0.552 to cut "5%" from the upper tail of the null distribution, we would actually cut somewhere between 4.9% and 5.3%. To be sure that no more than 5% lies above the critical value, we have to go up to the next higher of the 162 unique values of $r_S,$ which is 0.5636, matching the tabled 0.564 (the cutoff 0.563 below lies between the two attainable values, so it isolates the same tail):

 mean(r.s >= 0.563)
 ## 0.04904

So, because of the discreteness of the null distribution of $r_S,$ we cannot test at exactly the $2(.05) = 10$% level. Instead, we must use the critical values $\pm 0.564$ as in the printed table, testing at the $2(0.049) = 9.8$% level.

Similarly, for a two-sided test at the 2% level, the table gives critical values $\pm 0.745,$ while the simulation gives $\pm 0.721.$ The exact significance level using $\pm 0.745$ is $0.74$%, not $2.0$%.
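For small $n$ this discreteness can be displayed exactly, with no simulation at all, by carrying out the 'permutation procedure' exhaustively. A Python sketch for $n = 5$ (all $5! = 120$ permutations) prints the largest attainable critical values with their exact upper-tail probabilities, showing that the attainable levels jump past some round numbers:

```python
from itertools import permutations
from collections import Counter

n = 5
base = tuple(range(1, n + 1))

# Exact null distribution of r_S: under independence, every ordering
# of the Y-ranks is equally likely.
dist = Counter(
    1 - 6 * sum((a - b) ** 2 for a, b in zip(base, p)) / (n * (n**2 - 1))
    for p in permutations(base)
)
total = sum(dist.values())                 # 5! = 120

# exact upper-tail probability at each of the largest attainable values
for c in sorted(dist, reverse=True)[:4]:
    tail = sum(v for r, v in dist.items() if r >= c - 1e-9)
    print(f"c = {c:+.3f}   P(r_S >= c) = {tail}/{total} = {tail/total:.4f}")
```

Enumeration like this (rather than sampling) is how exact small-$n$ tables can be built; different rounding of the attainable values is then a further source of disagreement among printed tables.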

Finally, I believe that the main discrepancies among printed critical values are due to rounding and discreteness, and would not disappear with larger-scale simulations.

The graph below shows the simulated null distribution of $r_S$ for $n=10.$ More iterations would make a slightly smoother graph, but would not change the critical values. The vertical red lines show approximate critical values for a two-sided test at the 10% level.

[Figure: histogram of the simulated null distribution of $r_S$ for $n = 10,$ with vertical red lines at the approximate 10%-level critical values.]