I hope this fairly basic statistics question also counts as mathematics, though the specific example I use is biological. I am confused about the meaning of statistical significance or non-significance when more than two groups are compared.
Let’s say there are three cancer drugs, A, B, and C. We have three groups of people in the trial, each treated with one of the above drugs. At the end of the trial, we measure the sizes of tumors; the more effective the drug is, the smaller the number is.
We obtained these results:
The average of A is lower than the average of C, and the difference is statistically significant in a t-test: p<0.05. So we think A is a better drug than C. The average of A is also lower than the average of B, but the difference is NOT statistically significant in a t-test: p>0.05. So we think A is not a better drug than B. The average of B is lower than the average of C, but the difference is NOT statistically significant in a t-test: p>0.05. So it seems B is not a better drug than C.
So how can A be no better than B, and B no better than C, yet A be better than C?
Can I conclude that A is better than C indeed?
If an overall one-factor ANOVA (with three levels of the factor) rejects the null hypothesis that all three population means are equal, there is no guarantee that post hoc analysis will completely resolve which means are equal and which are not.
You have found an instance where the sample means are in the order $\bar X_A < \bar X_B < \bar X_C.$ And you have evidence at a chosen level of significance to conclude that the difference between $\bar X_A$ and $\bar X_C$ is significant. So you suppose that $\mu_A < \mu_C.$
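However, you are simply unable to resolve whether the smaller difference between $\bar X_B$ and $\bar X_A$ reflects a real difference between $\mu_B$ and $\mu_A.$ The same goes for the smaller difference between $\bar X_C$ and $\bar X_B.$ A non-significant result is not evidence that two population means are equal; it only means the data cannot tell them apart.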
So you can only speculate whether $\mu_A < \mu_B < \mu_C$ or $\mu_A \le \mu_B < \mu_C$ or $\mu_A < \mu_B \le \mu_C.$ Worse than that, it could even be that $\mu_B < \mu_A$ or $\mu_B > \mu_C.$ The quantities $\bar X_A,\,\bar X_B,\,$ and $\bar X_C$ are only estimates of $\mu_A,\, \mu_B$ and $\mu_C.$
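To see how this pattern arises from ordinary arithmetic, here is a minimal Python sketch. The tumor sizes are made-up numbers chosen only for illustration (they are not from any trial), and the helper `pooled_t` is a hypothetical name. It computes the pooled two-sample t statistic for each pair and compares it with 2.101, the two-sided 5% critical value of Student's t with 18 degrees of freedom (two groups of 10).

```python
from math import sqrt
from statistics import mean, variance

# Made-up tumor sizes (hypothetical, for illustration only).
# Each group has the same spread; only the group means differ by 1 unit.
base = [-2.0, -1.0, 0.0, 1.0, 2.0, -2.0, -1.0, 0.0, 1.0, 2.0]
groups = {
    "A": [10.0 + x for x in base],
    "B": [11.0 + x for x in base],
    "C": [12.0 + x for x in base],
}

# Two-sided 5% critical value of Student's t with 10 + 10 - 2 = 18 df.
T_CRIT = 2.101


def pooled_t(x, y):
    """Pooled-variance two-sample t statistic for mean(y) - mean(x)."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(y) - mean(x)) / sqrt(sp2 * (1 / nx + 1 / ny))


results = {}
for low, high in [("A", "B"), ("B", "C"), ("A", "C")]:
    t = pooled_t(groups[low], groups[high])
    results[(low, high)] = (t, abs(t) > T_CRIT)
    verdict = "significant" if abs(t) > T_CRIT else "not significant"
    print(f"{low} vs {high}: t = {t:.2f} ({verdict} at the 5% level)")
```

With these numbers the A vs B and B vs C statistics are both 1.5, below the critical value, while the A vs C statistic is 3.0, above it. Nothing contradictory has happened: a difference of two units is large enough to stand out from the noise, and a difference of one unit is not.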
Be happy about the differences among population means that can be "resolved" (at some level of significance) by post hoc tests, and try not to be annoyed by non-significant differences.
Example: Here is a one-way ANOVA with five levels of the factor. The main F-test finds significant difference(s) among the levels, but the Tukey HSD post hoc test can resolve only that the two extremes are significantly different (marked by my #).
Simulated data:
ANOVA
Tukey HSD Comparisons.
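A comparable analysis can be sketched in Python. This is not the simulation used above: the five samples below are constructed deterministically (an assumption made so the result is reproducible), with equal spread and means that increase in steps of 0.8. The sketch assumes NumPy and SciPy 1.8 or later, which provides `scipy.stats.tukey_hsd`.

```python
import numpy as np
from scipy import stats

# Constructed data (hypothetical, not the original simulation):
# five groups of five observations with identical spread and
# group means 0.0, 0.8, 1.6, 2.4, 3.2.
base = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
samples = [base + 0.8 * i for i in range(5)]

# Overall one-way ANOVA: tests whether all five population means are equal.
anova = stats.f_oneway(*samples)
print(f"ANOVA: F = {anova.statistic:.2f}, p = {anova.pvalue:.4f}")

# Tukey HSD post hoc comparisons of every pair of groups.
tukey = stats.tukey_hsd(*samples)

k = len(samples)
significant_pairs = []
for i in range(k):
    for j in range(i + 1, k):
        p = tukey.pvalue[i, j]
        flag = "  #" if p < 0.05 else ""
        if p < 0.05:
            significant_pairs.append((i, j))
        print(f"group {i} vs group {j}: p = {p:.4f}{flag}")
```

For these data the overall F-test rejects equality of the means at the 5% level, yet the only pairwise comparison flagged with `#` is the one between the two extreme groups (0 and 4). Every intermediate difference remains unresolved, which is the same situation as in the three-drug question.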