Mean hypothesis testing of two populations

Question

1.3k Views Asked by Bumbble Comm At 29 Mar 2026 - 7:28

Hi there. I have a question that I hope will improve my understanding of this topic. Suppose a question asks me to conduct a hypothesis test about the means of two populations whose means and variances are unknown (e.g. measuring the effectiveness of a medication on patients before and after treatment, with a small sample size). Would I assume the variances are equal, or would I use a matched-pairs test?

Original Q&A

**Bumbble Comm** · Answer 1 · 2017-04-02 00:59:09

If you want to use a two-sample t procedure and you have before-and-after measurements on the same patients, you should use a paired t-test.

Now, with paired t-tests, we compute a difference variable, and then perform a one-sample inference on that difference variable. As such, there is only one variance, and you don't have to think about pooling variances at all.

Let $X_1, \ldots, X_n$ denote samples from "before" and $Y_1, \ldots, Y_n$ denote samples from "after". Denote the means of before and after as $\mu_x$ and $\mu_y$, respectively. Then we are interested in doing inference on the difference in population means $\Delta = \mu_y - \mu_x$. For example, we might test the hypothesis

$$ H_0: \Delta = 0$$ $$ H_A: \Delta \ne 0 $$

Then you create $d_i = Y_i - X_i$. Theoretically, we assume $d_1, \ldots, d_n \overset{iid}{\sim} N(\Delta, \sigma^2)$. Importantly, notice there is only one variance. Now you treat the $d$'s as your data and use the usual one-sample formulae. For example,

$$ t^* = \frac{\bar{d}-\Delta_0}{s_d/\sqrt{n}} $$

where $\bar{d}$ and $s_d$ are the sample mean and sample standard deviation of the $d_i$, and $\Delta_0$ is the value you specify in the null. Under the null, $t^* \sim T_{n-1}$ (a t distribution with $n-1$ degrees of freedom). Use this to calculate the desired p-values.
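As a rough illustration of the formula above, here is a minimal Python sketch that forms the differences and computes $t^*$ and its degrees of freedom. The function name `paired_t_statistic` and the `before`/`after` numbers are made up for this example; they are not from any real study.

```python
import math
import statistics


def paired_t_statistic(before, after, delta0=0.0):
    """Return (t_star, degrees_of_freedom) for a paired t-test."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # d_i = Y_i - X_i, i.e. "after" minus "before"
    d = [y - x for x, y in zip(before, after)]
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)  # sample standard deviation (n - 1 denominator)
    t_star = (d_bar - delta0) / (s_d / math.sqrt(n))
    return t_star, n - 1


# Hypothetical measurements, for illustration only
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [8.0, 11.0, 7.0, 10.0, 10.0]
t_star, df = paired_t_statistic(before, after)
print(f"t* = {t_star:.3f} on {df} degrees of freedom")
```

The p-value then comes from comparing `t_star` with a t distribution on `df` degrees of freedom, using a table or a statistics package.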

I tried to give a brief summary and cut right to your answer. You will still need to read your textbook to learn how to check the assumptions and to work through some examples.

**Bumbble Comm** · Answer 2 · 2017-04-02 16:58:31

@RMurphy has correctly described a paired t test, which is applicable if the same subjects are used 'before' and 'after', and if the differences $d_i$ in their responses are from an approximately normal population.

However, you mention a small sample of subjects. If $n$ is small and data are markedly non-normal, then the t statistic might not have a t distribution, and you could get a misleading result.

For example, here are differences in paired observations from Bethel et al. (1989), in which $n = 19$ subjects with asthma were measured for SAR (a measure of airway resistance) in pure air and in air contaminated with a small amount of $SO_2.$

Diff:
     0.10    -0.19     0.46    -0.66    -0.92     0.94    -1.13    -1.24    -3.90    -4.99
    -5.20    -5.23    -5.36    -6.01     7.33    -9.00   -12.95    14.79   -18.23

As computed in Minitab 17 software, a paired t test fails to find a significant difference (5% level) in population means under the two test conditions:

One-Sample T: Diff 

Test of μ = 0 vs ≠ 0

Variable   N   Mean  StDev  SE Mean      95% CI         T      P
Diff      19  -2.70   6.99     1.60  (-6.07, 0.66)  -1.69  0.109
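The summary figures in this output can be checked by hand from the listed differences. The sketch below uses only the Python standard library; the critical value 2.101 (the 0.975 quantile of a t distribution with 18 degrees of freedom) is taken from a standard t table rather than computed, and the variable names are my own.

```python
import math
import statistics

# Differences as listed above (n = 19)
diff = [0.10, -0.19, 0.46, -0.66, -0.92, 0.94, -1.13, -1.24, -3.90, -4.99,
        -5.20, -5.23, -5.36, -6.01, 7.33, -9.00, -12.95, 14.79, -18.23]

n = len(diff)
mean = statistics.mean(diff)
sd = statistics.stdev(diff)      # sample standard deviation
se = sd / math.sqrt(n)           # standard error of the mean
t_stat = mean / se               # test of mu = 0

T_CRIT = 2.101                   # t table value for 18 df, two-sided 95%
ci = (mean - T_CRIT * se, mean + T_CRIT * se)

print(f"N={n} Mean={mean:.2f} StDev={sd:.2f} SE={se:.2f} T={t_stat:.2f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval contains 0, which agrees with the t test not rejecting at the 5% level.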

By contrast, a Wilcoxon signed-rank test, which does not assume normal data, does find a significant difference (5% level) between medians.

Wilcoxon Signed Rank Test: Diff 

Test of median = 0.000000 versus median ≠ 0.000000

          N for   Wilcoxon         Estimated
       N   Test  Statistic      P     Median
Diff  19     19       43.0  0.038     -2.743

[The reason for listing separately the 'sample size for the test' is that $0$ differences (not present here) would be discarded before testing.]
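For readers who want to see where the statistic comes from, here is a sketch of the signed-rank computation in plain Python. It drops zero differences, ranks the absolute values (average ranks for ties), and sums the ranks of the positive differences. The p-value uses a normal approximation with a continuity correction and no tie adjustment; that is an assumption on my part about one reasonable way to get a p-value, not a statement of what Minitab does internally. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def wilcoxon_signed_rank(diffs):
    """Return (n_used, w_plus, two_sided_p) via a normal approximation."""
    nonzero = [d for d in diffs if d != 0]  # zero differences are discarded
    n = len(nonzero)
    # Rank |d_i|, giving tied values the average of the ranks they occupy
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # Statistic: sum of ranks attached to positive differences
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # no tie correction
    # Continuity correction: move w_plus half a unit toward its null mean
    correction = 0.5 if w_plus < mu else (-0.5 if w_plus > mu else 0.0)
    z = (w_plus - mu + correction) / sigma
    p = min(1.0, 2 * NormalDist().cdf(-abs(z)))
    return n, w_plus, p


diff = [0.10, -0.19, 0.46, -0.66, -0.92, 0.94, -1.13, -1.24, -3.90, -4.99,
        -5.20, -5.23, -5.36, -6.01, 7.33, -9.00, -12.95, 14.79, -18.23]
n_used, w_plus, p_value = wilcoxon_signed_rank(diff)
print(n_used, w_plus, round(p_value, 3))
```

Only five of the 19 differences are positive, and they carry ranks 1, 3, 6, 15 and 18, which is why the statistic is so far below its null mean of 95.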

Another possible nonparametric test for these data is a permutation test, illustrated in Eudey et al. (2010), Sec 3.
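One common form of permutation test for paired data randomly flips the sign of each difference, since under the null hypothesis of no effect each $d_i$ is equally likely to be positive or negative. The sketch below is a Monte Carlo version of that idea using the mean difference as the statistic; it is not necessarily the exact procedure in Eudey et al., and the function name, number of resamples, and seed are arbitrary choices for this example.

```python
import random


def sign_flip_test(diffs, n_resamples=10_000, seed=0):
    """Approximate two-sided p-value for H0: differences are symmetric about 0."""
    rng = random.Random(seed)
    observed = abs(sum(diffs))  # |sum| is equivalent to |mean| for fixed n
    hits = 0
    for _ in range(n_resamples):
        # Flip each difference's sign with probability 1/2
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(total) >= observed - 1e-12:  # small tolerance for float error
            hits += 1
    # Add-one adjustment keeps the estimate strictly positive
    return (hits + 1) / (n_resamples + 1)


diff = [0.10, -0.19, 0.46, -0.66, -0.92, 0.94, -1.13, -1.24, -3.90, -4.99,
        -5.20, -5.23, -5.36, -6.01, 7.33, -9.00, -12.95, 14.79, -18.23]
print(f"approximate permutation p-value: {sign_flip_test(diff):.3f}")
```

Because the statistic here is the mean, this test is affected by the few large differences much as the t test is; a permutation test built on ranks or medians would behave more like the signed-rank test.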

You can find a listing of the 'Air' and 'SO2' values, from which the differences (shown above) were derived, in the Eudey article, in the original Bethel paper, or in the biostatistics textbook by Pagano and Gauvreau.

Note: You do not say explicitly that the same subjects are used with and without the drug. If instead you have two independent groups of randomly chosen subjects, you would need to do a two-sample test, and you should consider (a) a Welch separate-variances t test (for normal data), or (b) a nonparametric Mann-Whitney-Wilcoxon rank sum test or a permutation test (for non-normal data).
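For the two-independent-groups case, the separate-variances (Welch) t statistic and its approximate degrees of freedom (the Welch-Satterthwaite formula) can be sketched as follows. The function name and the two small groups are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math
import statistics


def welch_t(x, y):
    """Return (t, approx_df) for a two-sample test without pooling variances."""
    n1, n2 = len(x), len(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # sample variances
    se2 = v1 / n1 + v2 / n2  # squared standard error of the mean difference
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Hypothetical data for two independent groups
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

Note that the degrees of freedom are generally not an integer, and they fall below $n_1 + n_2 - 2$ when the two sample variances differ.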
