Wilcoxon signed-rank test


While reading Wikipedia and my teacher's notes, I found that the Wilcoxon signed-rank test for $n > 10$ is given as follows:

Under the null hypothesis, $W$ follows a specific distribution with no simple expression. This distribution has an expected value of $0$ and a variance of $$\frac{N_r(N_r + 1)(2N_r + 1)}{6}.$$ $W$ can be compared to a critical value from a reference table. [1] The two-sided test consists in rejecting $H_0$ if $|W| \ge W_{critical, N_r}$. As $N_r$ increases, the sampling distribution of $W$ converges to a normal distribution. Thus, for $N_r \ge 10$, a $z$-score can be calculated as $$z = \frac{W}{\sigma_W}, \quad \sigma_W = \sqrt{\frac{N_r(N_r + 1)(2N_r + 1)}{6}}.$$ If $|z| > z_{critical}$, then reject $H_0$ (two-sided test).

Reference: [1]

On the other hand, four other books I am using give, for the same test, the mean $\mu_T = \frac{N_r(N_r + 1)}{4}$ and the standard deviation $\sigma_T=\sqrt{\frac{N_r(N_r + 1)(2N_r + 1)}{24}}$.

Which is the right one?

Best answer

There are at least two versions of the Wilcoxon signed-rank test, used to test whether the population median of paired differences is $0.$

Suppose you have $N$ differences $d_i = x_i - y_i.$ Delete any pairs with $d_i = 0$ because they contain no useful information for our purposes. Call the reduced sample size after any deletions $N_r.$ Then rank the absolute values $|d_i|$ from $1$ through $N_r$ to obtain $r_i$. For simplicity, we assume that there are no ties, so the $r_i$ are integers from $1$ through $N_r.$ Next 'sign' the ranks to get $s_i$: that is, if $d_i > 0,$ then $s_i = r_i$; and if $d_i < 0,$ then $s_i = -r_i.$
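The ranking and signing steps above can be sketched in a few lines of Python. This is only an illustration under the no-ties assumption made above; the function name `signed_ranks` and the sample data are made up for the example.

```python
def signed_ranks(x, y):
    """Return the signed ranks s_i for paired samples (assumes no ties among |d_i|)."""
    # Differences d_i = x_i - y_i, with zero differences deleted.
    d = [a - b for a, b in zip(x, y) if a != b]
    # Positions of d ordered by increasing |d_i|.
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    s = [0] * len(d)
    for rank, i in enumerate(order, start=1):
        # Attach the sign of d_i to its rank r_i.
        s[i] = rank if d[i] > 0 else -rank
    return s


# Hypothetical paired data; the third pair has d_i = 0 and is dropped.
x = [121, 98, 114, 100, 135, 89]
y = [100, 105, 114, 72, 120, 90]
print(signed_ranks(x, y))  # [4, -2, 5, 3, -1]
```

Here $N = 6$ but $N_r = 5$, because one pair has a zero difference.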

(1) In the Wikipedia page to which you link, the statistic $W$ is found by summing all $N_r$ of the $s_i.$ Thus under $H_0,$ we have $E(W) = 0.$ Also, $SD(W)$ is as Wikipedia says.
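For small $N_r$ the mean and variance of $W$ can be checked by brute force. The sketch below assumes that under the null hypothesis each rank independently gets a $+$ or $-$ sign with probability $1/2$, so that all $2^{N_r}$ sign patterns are equally likely; the function name `exact_moments_of_W` is chosen just for this illustration.

```python
from fractions import Fraction
from itertools import product


def exact_moments_of_W(n):
    """Exact mean and variance of W over all 2**n equally likely sign patterns."""
    values = [
        sum(sign * rank for sign, rank in zip(signs, range(1, n + 1)))
        for signs in product((-1, 1), repeat=n)
    ]
    mean = Fraction(sum(values), len(values))
    var = Fraction(sum(v * v for v in values), len(values)) - mean ** 2
    return mean, var


for n in (3, 5, 8):
    mean, var = exact_moments_of_W(n)
    formula = Fraction(n * (n + 1) * (2 * n + 1), 6)
    # The enumerated variance should match N_r(N_r + 1)(2N_r + 1)/6.
    print(n, mean, var, formula)
```

The enumeration grows as $2^{N_r}$, so it is only practical for small samples, which is where tables of exact critical values are used anyway.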

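(2) In the version you found elsewhere, the number $T^+$ is found by summing only the positive signed ranks $s_i$, and $T^-$ is the absolute value of the sum of the negative signed ranks. Then the test statistic $T$ is the smaller of $T^+$ and $T^-.\,$ Under $H_0,$ one can show that $E(T^+) = E(T^-) = N_r(N_r + 1)/4,$ and the standard deviation is as in the second equation you quoted; these are the values used in the normal approximation for $T.$ The two versions agree: because $T^+ + T^- = N_r(N_r + 1)/2$ and $W = T^+ - T^-,$ we have $W = 2T^+ - N_r(N_r + 1)/2,$ so $SD(W) = 2\,SD(T^+)$ and both versions give the same $|z|.$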
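A quick numerical check that the two versions lead to the same $z$-score may help. The sketch below starts from a made-up list of signed ranks with $N_r = 12$ and no ties; the function name `wilcoxon_z_scores` is hypothetical, and no continuity correction is applied.

```python
import math


def wilcoxon_z_scores(s):
    """Compute T+, T-, W and the z-scores from both versions of the test."""
    n = len(s)
    t_plus = sum(v for v in s if v > 0)      # sum of positive signed ranks
    t_minus = -sum(v for v in s if v < 0)    # absolute sum of negative signed ranks
    w = t_plus - t_minus                     # Wikipedia's statistic W

    # Version (1): E(W) = 0, SD(W) = sqrt(n(n+1)(2n+1)/6).
    z_w = w / math.sqrt(n * (n + 1) * (2 * n + 1) / 6)

    # Version (2): mean n(n+1)/4, SD sqrt(n(n+1)(2n+1)/24), applied to T+.
    z_t = (t_plus - n * (n + 1) / 4) / math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return t_plus, t_minus, w, z_w, z_t


# Hypothetical signed ranks for N_r = 12.
s = [1, -2, 3, 4, -5, 6, 7, 8, -9, 10, 11, 12]
t_plus, t_minus, w, z_w, z_t = wilcoxon_z_scores(s)
print(t_plus, t_minus, w)        # 62 16 46
print(math.isclose(z_w, z_t))    # True
```

Using $T = \min(T^+, T^-)$ in place of $T^+$ in version (2) gives $-|z|$, so a two-sided test reaches the same decision either way.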

I don't know what 'other books' you have at hand, but I looked at Siegel (1956) and Ott/Longnecker (2001); notations differ, but results are the same.

Notes:

  • If one is sampling from continuous distributions, then in theory there will be no $0$'s to delete, and there will be no ties among the $|d_i|,$ but in practice one must round to some number of digits and occasional ties may occur. Traditionally, texts provide tables of critical values (at various significance levels) for 'small' $N_r,$ and suggest normal approximations based on the mean and variance of their test statistic for 'large' numbers of pairs.

  • Software packages differ in their treatments of ties--all the way from (a) ignoring them, using the normal approximation, and giving a warning message, through (b) giving a simulated P-value that matches the situation at hand.

  • Information is lost when data are reduced to ranks. Often, a better choice than the Wilcoxon signed-rank test is to use a permutation test, or to use a paired t test if the nature of the data permits.

  • The Wilcoxon signed-rank test (for paired data) should not be confused with the Wilcoxon rank-sum test, equivalent to the Mann-Whitney test (for two independent samples). In R statistical software the function wilcox.test does either the paired test or the two-sample test, depending on the parameters used.
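As an illustration of the permutation alternative mentioned in the notes, here is a minimal sketch of an exact sign-flip permutation test on the raw differences. It assumes that, under the null hypothesis, the differences are symmetric about $0$, so each $d_i$ is equally likely to carry either sign; the function name `sign_flip_p_value` and the choice of the sum of differences as test statistic are assumptions of this example, not a prescribed method.

```python
from itertools import product


def sign_flip_p_value(d):
    """Exact two-sided p-value from all 2**n sign assignments of the differences."""
    magnitudes = [abs(v) for v in d]
    observed = abs(sum(d))
    count = 0
    total = 0
    for signs in product((-1, 1), repeat=len(d)):
        total += 1
        # Count sign patterns whose |sum| is at least as extreme as observed.
        if abs(sum(sg * m for sg, m in zip(signs, magnitudes))) >= observed:
            count += 1
    return count / total


# Hypothetical integer differences; integers keep the >= comparison exact.
d = [21, -7, 28, 15, -1, 9, 12, 30]
print(sign_flip_p_value(d))
```

Unlike the signed-rank test, this uses the actual magnitudes of the differences rather than their ranks. For more than about 20 pairs, full enumeration becomes slow and one would sample random sign patterns instead.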