Can we perform a t-test and a Wilcoxon rank sum test based on mean and std?

Question


69 Views Asked by Bumbble Comm At 13 Apr 2026 - 5:51

I want to compare the performance of two different stochastic methods on a problem. For method A, I have the results of 50 independent runs on the problem. For method B, however, I only have the mean and std of 50 runs. I want to perform a t-test and a Wilcoxon rank sum test on these methods. Would the results of these tests, based on mean and std, be reliable and correct?
Also, how can these tests be performed in MATLAB?


**Bumbble Comm** · Accepted Answer

t test: Yes. Sample sizes, means and standard deviations are sufficient for doing a t test. You already have $n_2 = 50, \bar X_2,$ and $S_2$ for the second sample. Because you have the data for the first sample you have $n_1 = 50$ and you can compute $\bar X_1$ and $S_1.$

Then the pooled t statistic is

$$T = \frac{\bar X_1 - \bar X_2}{S_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}},$$ where $S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2-1)S_2^2}{n_1 + n_2 - 2},$ and under the null hypothesis $H_0: \mu_1 = \mu_2,$ we have $T \sim \mathsf{T}(\text{df}=n_1+n_2 - 2)$ [Student's t distribution with $n_1 + n_2 - 2$ degrees of freedom]. So, for $n_1 = n_2 = 50$ (that is, $\text{df} = 98$), you would reject $H_0: \mu_1 = \mu_2$ against the alternative $H_a: \mu_1 \ne \mu_2$ at the 5% level of significance if $|T| \ge 1.984.$ You can get the two-sided critical value $c = 1.984$ from a printed table of Student's t distribution or use software (value from R below):

```r
> qt(.975, 98)
[1] 1.984467
```
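
As a minimal sketch of how the pooled computation works from summary statistics alone, the Python snippet below evaluates $S_p^2$ and $T$ exactly as defined above. The helper name `pooled_t` is hypothetical, Python is used only for illustration, and the inputs are the rounded summary statistics of the simulated samples shown later in this answer. The critical value is the `qt(.975, 98)` result quoted above.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled two-sample t statistic and its degrees of freedom,
    computed from summary statistics only (hypothetical helper)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df  # pooled variance S_p^2
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)     # standard error of the difference
    return (mean1 - mean2) / se, df

# Summary statistics of the simulated samples x1 and x2 from the example in this answer.
t_stat, df = pooled_t(97.6414, 14.96245, 50, 96.8144, 13.73883, 50)
reject = abs(t_stat) >= 1.984467  # two-sided 5% critical value for df = 98
print(t_stat, df, reject)
```

Up to rounding of the inputs, the statistic should agree with the value R reports for the pooled test on the same samples, and `reject` is false because $|T|$ is far below the critical value.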

This test assumes that the two populations have (nearly) the same population variances: $\sigma_1^2 \approx \sigma_2^2.$ If you have reason to believe that is not true, you can use the 'Welch separate-variances' t test. For equal sample sizes $n_1 = n_2$ the $T$-statistic is the same as for the pooled test just described, but the degrees of freedom will be smaller than 98 (roughly to the degree that the sample variances are not equal). You can get the formula for the degrees of freedom on Wikipedia or in almost any elementary or intermediate level statistics text. [With $n_1 = n_2 = 50,$ it seems likely that the degrees of freedom would be greater than 30, in which case you could use the approximate critical value $c = 2.0$ for a test at the 5% level.]
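
For reference, the usual Welch-Satterthwaite approximation for the degrees of freedom is $\nu = \dfrac{(S_1^2/n_1 + S_2^2/n_2)^2}{\frac{(S_1^2/n_1)^2}{n_1-1} + \frac{(S_2^2/n_2)^2}{n_2-1}}.$ A small Python sketch of it follows; the helper name `welch_df` is hypothetical, and the inputs are again the rounded summary statistics of the simulated samples used in the example in this answer.

```python
def welch_df(sd1, n1, sd2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (hypothetical helper)."""
    v1 = sd1**2 / n1  # estimated variance of the first sample mean
    v2 = sd2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Standard deviations and sizes of the simulated samples x1 and x2.
df_welch = welch_df(14.96245, 50, 13.73883, 50)
# With equal sample sizes and equal standard deviations the formula gives n1 + n2 - 2.
df_equal = welch_df(10.0, 50, 10.0, 50)
print(df_welch, df_equal)
```

Because the two sample standard deviations are close, the Welch degrees of freedom come out only slightly below 98, so the two tests give nearly the same answer here.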

Wilcoxon rank sum test: No. There is no way to do a two-sample Wilcoxon test without having access to the data for the second group, because the test statistic is computed from the ranks of the individual observations. [Unless the data for the first sample are extremely skewed or have many far outliers, the t test should be OK. (It is typical for a normal sample of size 50 to show a couple of moderate outliers.) Of course, it would be nice to be able to look at the data for the second sample, but unless you have reason to suspect otherwise, it seems safe to assume it shares near-normality with sample 1.]
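
To illustrate why the mean and standard deviation are not enough for a rank-based test, here is a toy Python sketch. All three tiny datasets are made up for this illustration, and the helper `rank_sum` is a simplified rank-sum computation that assumes no tied values. The two candidate "method B" samples have the same mean and standard deviation, yet they give different rank sums against the same "method A" sample.

```python
import math
import statistics

def rank_sum(a, b):
    """Sum of the ranks of a's values in the combined sample
    (simplified sketch; assumes no tied values)."""
    combined = sorted(a + b)
    return sum(combined.index(x) + 1 for x in a)

a = [11, 12, 14]     # made-up "method A" results
b_one = [3, 10, 17]  # made-up "method B" results: mean 10, sd 7
b_two = [2, 13, 15]  # different made-up data with the same mean and sd

same_summary = (
    statistics.mean(b_one) == statistics.mean(b_two)
    and math.isclose(statistics.stdev(b_one), statistics.stdev(b_two))
)
w_one = rank_sum(a, b_one)
w_two = rank_sum(a, b_two)
print(same_summary, w_one, w_two)  # prints: True 12 10
```

Since the rank sum changes while the summary statistics stay fixed, the Wilcoxon statistic cannot be recovered from the mean and standard deviation alone.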

Example: I don't know whether MATLAB does t tests, but the computations here are not beyond what you can do on a simple calculator. In R, here is how the pooled and Welch t tests look for the two fake datasets I have simulated below.

```r
x1 = round(rnorm(50, 100, 15), 2);  x2 = round(rnorm(50, 98, 12), 2)

sort(x1)
 [1]  70.56  71.10  73.87  77.27  78.79  79.90  80.23  81.54  82.35  82.86
[11]  84.18  84.28  85.04  87.33  88.57  90.03  90.64  91.29  91.43  92.07
[21]  92.13  92.29  92.70  95.47  96.93  98.05  98.29  99.41 102.82 102.87
[31] 103.14 103.51 104.13 105.42 106.52 106.95 107.56 107.96 108.11 108.37
[41] 108.60 108.79 113.36 114.76 115.99 117.15 121.29 122.64 128.65 134.88
sort(x2)
 [1]  64.52  65.08  77.61  77.71  78.51  80.74  81.11  81.55  82.78  84.11
[11]  84.73  86.02  87.87  88.15  88.34  92.60  92.76  93.47  93.62  94.81
[21]  94.93  95.14  96.20  96.31  96.96  96.98  97.37  99.33  99.61  99.93
[31] 100.16 100.61 101.88 102.76 103.19 104.24 104.51 105.44 105.45 106.85
[41] 108.05 109.51 111.53 112.80 112.97 113.10 114.00 123.08 124.68 127.06

summary(x1); sd(x1)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  70.56   85.61   97.49   97.64  107.86  134.88 
## 14.96245
summary(x2); sd(x2)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  64.52   87.94   96.97   96.81  105.21  127.06 
## 13.73883
```

There are slight differences between the two normal samples, but they are too small to be detected by either the pooled or the Welch test (both P-values exceed 0.05; the T statistics are well below 2 in absolute value).

```r
t.test(x1, x2, var.eq=T)  # pooled test

        Two Sample t-test

data:  x1 and x2
t = 0.28788, df = 98, p-value = 0.774
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -4.873849  6.527849
sample estimates:
mean of x mean of y 
  97.6414   96.8144 

t.test(x1, x2)  # Welch test

        Welch Two Sample t-test

data:  x1 and x2
t = 0.28788, df = 97.295, p-value = 0.7741
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -4.874365  6.528365
sample estimates:
mean of x mean of y 
  97.6414   96.8144 
```
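
The pooled confidence interval can also be recovered from summary statistics alone, which is the situation in the question. The Python sketch below is only an illustration: the helper name `pooled_ci` is hypothetical, the inputs are the rounded means and standard deviations printed above, and the critical value 1.984467 is the `qt(.975, 98)` value quoted earlier.

```python
import math

def pooled_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Two-sided confidence interval for mu1 - mu2 from the pooled t procedure,
    using summary statistics only (hypothetical helper)."""
    diff = mean1 - mean2
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)  # pooled variance
    margin = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))           # half-width of the interval
    return diff - margin, diff + margin

lower, upper = pooled_ci(97.6414, 14.96245, 50, 96.8144, 13.73883, 50, 1.984467)
print(lower, upper)
```

Up to rounding of the inputs, the endpoints should match the 95 percent confidence interval in the pooled `t.test` output, and the interval contains 0, consistent with not rejecting $H_0.$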
