Wilcoxon rank sum test for two independent samples with ties


We're comparing reaction times in a petri dish, with and without a reagent. We suppose that the reagent accelerates the reaction. We have m + n = 4 + 3 observations (in hours):

Xi : 10, 10, 13, 13 and Yi : 7, 10, 9

If we order the data set we have:

7, 9, 10, 10, 10, 13, 13 and the associated ranks are 1, 2, 4, 4, 4, 6.5, 6.5.

Our observed value W0* is 1+2+4 = 7

I have calculated the exact distribution using R : the exact distribution of W* is :

$$\begin{array}{|c|c|}
\hline
W^* = k & \Pr(W^* = k) \\ \hline
7 & 3/35 \\ \hline
9 & 3/35 \\ \hline
9.5 & 2/35 \\ \hline
10 & 3/35 \\ \hline
11.5 & 6/35 \\ \hline
12 & 1/35 \\ \hline
12.5 & 6/35 \\ \hline
14 & 1/35 \\ \hline
14.5 & 6/35 \\ \hline
15 & 1/35 \\ \hline
17 & 3/35 \\ \hline
\end{array}$$
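A minimal sketch of how an exact distribution like this can be enumerated in R, assuming W* is the sum of the midranks of the Y-sample and that every choice of 3 positions out of 7 is equally likely under H0. The variable names (`w_all`, `exact_dist`, `p_exact`) are illustrative only.

```r
# Data from the question
x <- c(10, 10, 13, 13)
y <- c(7, 10, 9)
m <- length(x); n <- length(y)

# Midranks of the pooled sample; tied values share their average rank
r <- rank(c(x, y))

# Rank sum for every possible assignment of n pooled positions to the Y-sample
w_all <- combn(m + n, n, function(idx) sum(r[idx]))
exact_dist <- table(w_all) / length(w_all)   # Pr(W* = k) for each attainable k

# Observed rank sum of the y's and the lower-tail exact p-value Pr(W* <= w_obs)
w_obs <- sum(r[(m + 1):(m + n)])
p_exact <- mean(w_all <= w_obs)
```

With these data, `w_obs` is 7 and `p_exact` is 3/35, in agreement with the first row of the table.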

My question is: with an alpha of 0.10, test H0: Y =d X against the alternative suggested by the practical situation, based on the exact distribution of W*.

Then redo the test in R and reproduce R's answer by hand, using the normal approximation (with continuity correction) under H0:

$$\Pr(W^* \ge c) = 1 - \Phi\left(\frac{c - n(m+n+1)/2}{\sqrt{\operatorname{var}(W^*)}}\right)$$

where E[W*] = n(m+n+1)/2 and $$\operatorname{var}(W^*) = \frac{mn(m+n+1)}{12} - \frac{mn \sum_k (d_k^{3}-d_k)}{12(m+n)(m+n-1)}$$

My attempt: since the observed value W0* is 1+2+4 = 7, the exact p-value is Pr(W* ≤ 7). From the exact distribution calculated with R, the p-value is 3/35 = 0.08571429. With an alpha of 0.10, the p-value is smaller than 0.10, so we can reject the null hypothesis: Y doesn't have the same distribution as X.

What I don't understand is the normal approximation

R code :

x<-c(10,10,13,13)

y<-c(7,10,9)

wilcox.test(x,y)

Wilcoxon rank sum test with continuity correction

data: x and y

W = 11, p-value = 0.09548

To calculate p-value using the normal approximation :

Let V be the vector of distinct ranks:

V = (1, 2, 4, 6.5)

D = (1, 1, 3, 2)

where D is the number of observations at each rank.

E[W*]= (3*8)/2 = 12

Var[W*] = (3*4*8)/12 - 3*4*((1^3-1)+(1^3-1)+(3^3-3)+(2^3-2))/(12*7*6) = 8 - (30/42)

Pr(W* ≤ 7.5) (with continuity correction) = Φ((7.5-12)/sqrt(8-(30/42))) = Φ(-1.667157) = 0.04774162
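The hand calculation above can be checked with a short R sketch. It assumes the tie-corrected mean and variance formulas as used in the numeric steps, with W* taken as the rank sum of the y's. The names `ew`, `vw`, `p_norm` and `p_r` are illustrative.

```r
x <- c(10, 10, 13, 13)
y <- c(7, 10, 9)
m <- length(x); n <- length(y); N <- m + n

r <- rank(c(x, y))                     # midranks of the pooled sample
w_obs <- sum(r[(m + 1):N])             # observed rank sum of the y's

d <- as.vector(table(c(x, y)))         # tie-group sizes d_k: 1, 1, 3, 2

ew <- n * (N + 1) / 2                  # E[W*] under H0
vw <- m * n * (N + 1) / 12 -
  m * n * sum(d^3 - d) / (12 * N * (N - 1))   # tie-corrected Var[W*]

# Lower-tail probability with continuity correction: Pr(W* <= w_obs + 0.5)
z <- (w_obs + 0.5 - ew) / sqrt(vw)
p_norm <- pnorm(z)

# One-sided p-value reported by R (warning about ties suppressed)
p_r <- suppressWarnings(
  wilcox.test(x, y, alternative = "greater", correct = TRUE)$p.value
)
```

Here `p_norm` and `p_r` agree (about 0.04774), even though R reports its statistic W = 11 in terms of the x's rather than the y's.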

The p-value calculated with normal approximation is very different from the exact p-value. But with the following test :

wilcox.test(x,y, alternative="g")

Wilcoxon rank sum test with continuity correction

data: x and y

W = 11, p-value = 0.04774

alternative hypothesis: true location shift is greater than 0

The p-value is exactly the one calculated with normal approximation.
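One way to see how the two R outputs relate, sketched under the assumption that the default call uses `alternative = "two.sided"` (the documented default of `wilcox.test`): with the normal approximation, the two-sided p-value is twice the smaller one-sided tail.

```r
x <- c(10, 10, 13, 13)
y <- c(7, 10, 9)

# Default call: two-sided alternative, normal approximation because of ties
p_two <- suppressWarnings(wilcox.test(x, y)$p.value)

# One-sided call: x tends to be larger than y
p_one <- suppressWarnings(wilcox.test(x, y, alternative = "greater")$p.value)

# The two-sided value is double the one-sided value
c(two_sided = p_two, one_sided = p_one, ratio = p_two / p_one)
```

So 0.09548 and 0.04774 come from the same approximation. They differ only in the alternative being tested.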

I am kind of confused with the hypothesis.

In this case we suppose that the location shift is >0 so X-Δ =d Y

We can see that the mean(x) > mean (y)

I don't understand why the p-value calculated with the normal approximation is far away from the exact p-value : 0.08571429

Any help will be appreciated !

Best answer

In addition to the ties, you have the disadvantage of very little data. So no test is going to provide really strong evidence that the distribution from which the x's come has a larger median than the distribution for the y's. Here are some results and accompanying comments:

(1) R statistical software attempts to give a P-value of a one-sided Wilcoxon rank-sum test using a normal approximation, and gives a warning message that the P-value is not exact:

wilcox.test(x, y, alt="greater")

        Wilcoxon rank sum test with continuity correction

data:  x and y 
W = 11, p-value = 0.04774
alternative hypothesis: true location shift is greater than 0 

Warning message:
In wilcox.test.default(x, y, alt = "greater") :
  cannot compute exact p-value with ties

(2) If all 4 x-values exceeded all of the y-values then the P-value of a one-sided test would be $\frac{1}{{7 \choose 4}} = \frac{1}{35} = 0.0286.$ But your data are less extreme than that, so no reasonable method of refining the Wilcoxon P-value can give a value smaller than about 3%.

(3) There are not enough observations to check for normality, and there is certainly no evidence against normality. Thus a t test that the population means differ is not unreasonable. Both the pooled and Welch (separate variances) versions of the two-sample one-sided t test give P-values of about 4%.
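A sketch of the two t tests mentioned in (3), assuming the one-sided alternative is that the x mean exceeds the y mean. `var.equal = TRUE` requests the pooled version, and the default is Welch.

```r
x <- c(10, 10, 13, 13)
y <- c(7, 10, 9)

# Pooled-variance two-sample t test, one-sided
p_pooled <- t.test(x, y, alternative = "greater", var.equal = TRUE)$p.value

# Welch (separate variances) t test, one-sided; this is t.test's default
p_welch <- t.test(x, y, alternative = "greater")$p.value

round(c(pooled = p_pooled, welch = p_welch), 4)
```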

(4) A one-sided permutation test of the difference of sample means gave a P-value of $8.57\%.$ [The largest possible difference in means is the observed one, 2.833. Under the null permutation distribution, that difference has probability $3/35 = 0.0857.$ (Any two of three $10$'s can be randomized to the x-group.)]
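The permutation test in (4) can be sketched by enumerating all 35 ways of assigning 4 of the 7 pooled values to the x-group. This is one possible implementation, not necessarily the code used for the figure quoted above; `perm_diff` and `p_perm` are illustrative names.

```r
x <- c(10, 10, 13, 13)
y <- c(7, 10, 9)
pooled <- c(x, y)

obs_diff <- mean(x) - mean(y)          # observed difference in sample means

# Difference in means for every possible choice of 4 values as the x-group
perm_diff <- combn(length(pooled), length(x), function(idx) {
  mean(pooled[idx]) - mean(pooled[-idx])
})

# One-sided P-value: share of assignments at least as extreme as observed
# (small tolerance guards against floating-point noise)
p_perm <- mean(perm_diff >= obs_diff - 1e-12)
```

Because the observed split is the most extreme one possible, `p_perm` comes out as 3/35.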

(5) An ad hoc method to adjust the Wilcoxon test for ties is to jitter the observations to break ties and average several results. (Jittering means to add a small bit of random noise, here less than $\pm 0.4$, to each observation.) I got average P-values between 6% and 7%.
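A rough sketch of the jittering idea in (5). The uniform noise on (-0.4, 0.4), the seed and the number of replicates are assumptions made for illustration, so individual runs will not reproduce the figures above exactly.

```r
set.seed(1)                            # arbitrary seed, for reproducibility only
x <- c(10, 10, 13, 13)
y <- c(7, 10, 9)

jitter_p <- function() {
  # Noise below 0.5 in size breaks ties without reordering distinct values
  xj <- x + runif(length(x), -0.4, 0.4)
  yj <- y + runif(length(y), -0.4, 0.4)
  # With ties broken, wilcox.test returns an exact P-value for small samples
  wilcox.test(xj, yj, alternative = "greater")$p.value
}

p_jit <- replicate(2000, jitter_p())
mean(p_jit)                            # average one-sided P-value across jitters
```

Each jittered P-value depends only on where the y-group's 10 falls among the three tied 10's, so the average settles between 6% and 7%.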

Cherry-picking R's approximate P-value of the Wilcoxon test or the P-value of a t test, you might claim that you have 'suggestive' evidence that x's tend to be larger than y's. Perhaps these are promising enough results to prompt a more extensive experiment. But I see no way to draw a convincing conclusion from the existing data.

If this is an exercise to understand the Wilcoxon test statistic, you have to realize that the statistic itself is the important result. The standard (combinatorial) distribution theory to get a P-value from that statistic does not apply because of the ties. Normal approximations are notoriously unreliable for such small values of $n$ and $m,$ so I am not surprised to see that your normal approximation gives an unsatisfactory value.