Hi there, I have a question, as I want to improve my understanding of this topic. Suppose a question asks me to conduct a hypothesis test about the means of two populations whose means and variances are unknown (e.g. measuring the effectiveness of a medication on patients before and after treatment, with a small sample size). Would I assume the variances are equal, or would I use a matched-pairs test?
Mean hypothesis testing of two populations
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 answers below.
@RMurphy has correctly described a paired t test, which is applicable if the same subjects are used 'before' and 'after', and if the differences $d_i$ in their responses are from an approximately normal population.
However, you mention a small sample of subjects. If $n$ is small and data are markedly non-normal, then the t statistic might not have a t distribution, and you could get a misleading result.
For example, here are differences in paired observations from Bethel et al. (1989), in which $n = 19$ subjects with asthma were measured for SAR (a measure of airway resistance) in pure air and in air contaminated with a small amount of $SO_2.$
Diff:
0.10 -0.19 0.46 -0.66 -0.92 0.94 -1.13 -1.24 -3.90 -4.99
-5.20 -5.23 -5.36 -6.01 7.33 -9.00 -12.95 14.79 -18.23
As computed in Minitab 17 software, a paired t test fails to find a significant difference (5% level) in population means under the two test conditions:
One-Sample T: Diff
Test of μ = 0 vs ≠ 0
Variable N Mean StDev SE Mean 95% CI T P
Diff 19 -2.70 6.99 1.60 (-6.07, 0.66) -1.69 0.109
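If you do not have Minitab, the same test can be run in Python. This is only a sketch, assuming NumPy and SciPy are installed; the variable names are my own. A paired t test is a one-sample t test on the differences, so `scipy.stats.ttest_1samp` is enough, and the printed values should agree with the Minitab output above up to rounding.

```python
import numpy as np
from scipy import stats

# Differences as listed above (Bethel et al., 1989), n = 19.
diff = np.array([
    0.10, -0.19, 0.46, -0.66, -0.92, 0.94, -1.13, -1.24, -3.90, -4.99,
    -5.20, -5.23, -5.36, -6.01, 7.33, -9.00, -12.95, 14.79, -18.23,
])

# A paired t test is a one-sample t test of the differences against 0.
result = stats.ttest_1samp(diff, popmean=0.0)
t_stat, p_value = result.statistic, result.pvalue

print(f"n = {diff.size}, mean = {diff.mean():.2f}, sd = {diff.std(ddof=1):.2f}")
print(f"t = {t_stat:.2f}, two-sided p = {p_value:.3f}")
```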
By contrast, a Wilcoxon signed-rank test, which does not assume normal data, does find a significant difference (5% level) between medians.
Wilcoxon Signed Rank Test: Diff
Test of median = 0.000000 versus median ≠ 0.000000
Variable   N   N for Test   Wilcoxon Statistic   P       Estimated Median
Diff       19  19           43.0                 0.038   -2.743
[The reason for listing separately the 'sample size for the test' is that $0$ differences (not present here) would be discarded before testing.]
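For readers who want to see where the statistic 43 comes from, here is a rough Python sketch, assuming SciPy is available. It ranks the absolute differences, sums the ranks of the positive differences, and then calls `scipy.stats.wilcoxon`. SciPy may compute an exact P-value for a sample this small, while the Minitab figure looks like a normal approximation (my assumption), so the P-value need not match 0.038 to the last digit.

```python
import numpy as np
from scipy import stats

# Differences as listed above (Bethel et al., 1989), n = 19.
diff = np.array([
    0.10, -0.19, 0.46, -0.66, -0.92, 0.94, -1.13, -1.24, -3.90, -4.99,
    -5.20, -5.23, -5.36, -6.01, 7.33, -9.00, -12.95, 14.79, -18.23,
])

# Rank the absolute differences, then sum the ranks by sign.
ranks = stats.rankdata(np.abs(diff))
w_plus = ranks[diff > 0].sum()
w_minus = ranks[diff < 0].sum()

# For a two-sided test SciPy reports the smaller of the two rank sums.
res = stats.wilcoxon(diff)

print(f"W+ = {w_plus:.0f}, W- = {w_minus:.0f}")
print(f"statistic = {res.statistic:.0f}, two-sided p = {res.pvalue:.3f}")
```

Only five of the 19 differences are positive, and three of them are among the smallest in absolute value, which is why the sum of positive ranks is so small.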
Another possible nonparametric test for these data is a permutation test, illustrated in Eudey et al. (2010), Sec 3.
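One common form of permutation test for paired data is the sign-flip test: under the null hypothesis each difference is equally likely to be positive or negative, so you randomly flip signs and see how often the resulting mean is at least as extreme as the observed one. The sketch below is my own illustration and is not necessarily the version in Eudey et al.; the seed, the number of resamples, and the choice of the mean as the test statistic are all assumptions, and a different statistic can lead to a different conclusion.

```python
import numpy as np

# Differences as listed above (Bethel et al., 1989), n = 19.
diff = np.array([
    0.10, -0.19, 0.46, -0.66, -0.92, 0.94, -1.13, -1.24, -3.90, -4.99,
    -5.20, -5.23, -5.36, -6.01, 7.33, -9.00, -12.95, 14.79, -18.23,
])

rng = np.random.default_rng(2024)  # arbitrary seed, for reproducibility only
n_resamples = 100_000              # arbitrary number of random sign patterns

observed = diff.mean()

# Under H0 each difference keeps or flips its sign with probability 1/2.
signs = rng.choice([-1.0, 1.0], size=(n_resamples, diff.size))
perm_means = (signs * diff).mean(axis=1)

# Two-sided Monte Carlo P-value; the +1 terms count the observed sample itself.
extreme = np.count_nonzero(np.abs(perm_means) >= abs(observed))
p_value = (extreme + 1) / (n_resamples + 1)

print(f"observed mean = {observed:.2f}, permutation p = {p_value:.3f}")
```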
You can find a listing of the 'Air' and 'SO2' values, from which the differences (shown above) were derived, in the Eudey article, in the original Bethel paper, or in the biostatistics textbook by Pagano and Gauvreau.
Note: You do not say explicitly that the same subjects are used with and without the drug. If instead you have two independent groups of randomly chosen subjects, you would need a two-sample test, and you should consider (a) a Welch separate-variances t test (for normal data), or (b) a nonparametric Mann-Whitney-Wilcoxon rank sum test or a permutation test (for non-normal data).
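For the two-independent-groups case, the calls in SciPy look roughly like this. The two samples here are simulated purely to make the snippet runnable; they are hypothetical and do not come from any study, and the group names, sizes, and parameters are my own choices.

```python
import numpy as np
from scipy import stats

# Simulated, hypothetical data for illustration only (not from any study).
rng = np.random.default_rng(0)
control = rng.normal(loc=10.0, scale=2.0, size=12)
treated = rng.normal(loc=12.0, scale=4.0, size=15)

# (a) Welch t test: equal_var=False means the variances are not pooled.
welch = stats.ttest_ind(treated, control, equal_var=False)

# (b) Mann-Whitney-Wilcoxon rank sum test, two-sided.
mwu = stats.mannwhitneyu(treated, control, alternative="two-sided")

print(f"Welch t = {welch.statistic:.2f}, p = {welch.pvalue:.3f}")
print(f"Mann-Whitney U = {mwu.statistic:.1f}, p = {mwu.pvalue:.3f}")
```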
If you want to use a two-sample T procedure, and you have before-after measurements on the same patients, you should use a paired T-test.
Now, with paired t-tests, we compute a difference variable, and then perform a one-sample inference on that difference variable. As such, there is only one variance, and you don't have to think about pooling variances at all.
Let $X_1, \ldots, X_n$ denote samples from "before" and $Y_1, \ldots, Y_n$ denote samples from "after". Denote the means of before and after as $\mu_x$ and $\mu_y$, respectively. Then we are interested in doing inference on the difference in population means $\Delta = \mu_y - \mu_x$. For example, we might test the hypothesis
$$ H_0: \Delta = 0$$ $$ H_A: \Delta \ne 0 $$
Then you create $d_i = Y_i - X_i$. Theoretically, we assume $d_1, \ldots, d_n \overset{\text{iid}}{\sim} N(\Delta, \sigma^2)$. Importantly, notice there is only one variance. Now you treat the $d$'s as your data and use the typical formulae. For example,
$$ t^* = \frac{\bar{d}-\Delta_0}{s_d/\sqrt{n}} $$
where $\Delta_0$ is the value you specify in the null and $s_d$ is the sample standard deviation of the $d_i$. Under the null, $t^* \sim T_{n-1}$ (a t distribution with $n-1$ degrees of freedom). Use this to calculate the desired p-values.
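To make the formula concrete, here is a minimal Python sketch that computes $t^*$ and a two-sided p-value directly from the definition. The function name `paired_t` and the before/after numbers are hypothetical, chosen only so the snippet runs; SciPy is assumed for the t distribution.

```python
import numpy as np
from scipy import stats

def paired_t(before, after, delta0=0.0):
    """Paired t statistic and two-sided p-value for H0: Delta = delta0."""
    d = np.asarray(after, dtype=float) - np.asarray(before, dtype=float)
    n = d.size
    s_d = d.std(ddof=1)                      # sample SD of the differences
    t_star = (d.mean() - delta0) / (s_d / np.sqrt(n))
    p_value = 2.0 * stats.t.sf(abs(t_star), df=n - 1)
    return t_star, p_value

# Hypothetical measurements on the same six patients (illustration only).
before = [140, 152, 138, 160, 147, 155]
after = [135, 150, 139, 151, 140, 149]

t_star, p_value = paired_t(before, after)
print(f"t* = {t_star:.3f}, two-sided p = {p_value:.4f}")
```

With `delta0=0.0` this should give the same answer as `scipy.stats.ttest_rel(after, before)`, which is a convenient check that you have set up the differences in the intended direction.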
I tried to give a brief summary and cut right to your answer. You will still need to read your textbook to learn how to check the assumptions and to work through some examples.