Suppose we want to know whether two methods, method A and method B, are comparably effective. Suppose that the (scientific) criterion for declaring the methods comparable is that the difference between their means, $mean_A - mean_B$, lies within the interval $\mathcal{I} = [LB, UB]$, where $LB < UB$ are set by an expert. One way to test this is to compute a $(1-\alpha)$ confidence interval for the difference $mean_A - mean_B$ using the two-sample t-distribution, with $\alpha \in (0,1)$, and to compare that confidence interval to $\mathcal{I}$.
Write the $(1-\alpha)$ confidence interval as $[cib, ciu]$. We then have two possibilities:
- $[cib, ciu] \subseteq \mathcal{I}$: we can confirm that the two methods are comparable with respect to our criterion,
- $[cib, ciu] \not \subseteq \mathcal{I}$ : we cannot confirm that the two methods are comparable with respect to our criterion.
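The check described above can be sketched in a few lines of Python. This is only an illustration under assumptions the question does not state: it uses the Welch (unequal-variance) form of the two-sample t interval, and the names `diff_ci` and `is_comparable` are hypothetical helpers, not part of any library.

```python
import numpy as np
from scipy import stats


def diff_ci(a, b, alpha=0.05):
    """(1 - alpha) confidence interval for mean(a) - mean(b).

    Assumption: Welch's unequal-variance t interval; a pooled-variance
    interval would be another valid reading of "two-sample t".
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    va = a.var(ddof=1) / a.size  # squared standard error of mean(a)
    vb = b.var(ddof=1) / b.size  # squared standard error of mean(b)
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    half_width = stats.t.ppf(1 - alpha / 2, df) * se
    d = a.mean() - b.mean()
    return d - half_width, d + half_width


def is_comparable(ci, lb, ub):
    """True when the interval [cib, ciu] lies entirely inside [LB, UB]."""
    cib, ciu = ci
    return lb <= cib and ciu <= ub
```

With two samples `data_a` and `data_b`, the decision would be `is_comparable(diff_ci(data_a, data_b, alpha), LB, UB)`; a `False` result means only that comparability could not be confirmed, not that the methods differ.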
The problem starts here. Imagine that there is a third method, method C. I have identified two ways to address the question: are the three methods comparable?
The first is to use the two-sample t-distribution to find the $(1-\alpha)$ confidence intervals for the three differences $mean_A - mean_B$, $mean_A - mean_C$ and $mean_B - mean_C$, and then apply points 1) and 2) above to each of them.
The second option is to use ANOVA to compute the confidence intervals (thus based on the F-distribution) for each difference. More precisely, you compute $D_{AB} = Data_A - Data_B$, $D_{AC} = Data_A - Data_C$ and $D_{BC} = Data_B - Data_C$, run an ANOVA on $D_{AB}, D_{AC}, D_{BC}$, take the confidence intervals that the software reports when you run the ANOVA, and compare them to the criterion interval, i.e. apply points 1) and 2) above.
My questions are:
Q1) Should we prefer the first option (t-distribution) or the second (F-distribution), for example in terms of power, width of the intervals, or other criteria?
Q2) What happens if, for example, $[cib_{AB}, ciu_{AB}] \subseteq [LB, UB]$ and $[cib_{AC}, ciu_{AC}] \subseteq [LB, UB]$, but $[cib_{BC}, ciu_{BC}] \not \subseteq [LB, UB]$? Do we have to consider that we cannot conclude anything about the comparability of the three methods? Or can we just say 'methods A and B are comparable' but nothing else?
Here $[cib_{AB}, ciu_{AB}]$ is the $(1-\alpha)$ confidence interval for the difference $mean_A - mean_B$, and similarly for the other two pairs.
Q3) If we use the first option (t-distribution), do we have to adjust the confidence level, since each confidence interval concerns only two methods while we are examining three differences simultaneously? This would be similar to applying a Bonferroni adjustment (or another correction).
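To make the adjustment in Q3 concrete, here is a minimal sketch of what a Bonferroni correction would look like for the three pairwise intervals: each interval is built at level $1 - \alpha/3$ instead of $1 - \alpha$. It assumes Welch-type t intervals, and `welch_ci` and `pairwise_cis` are hypothetical helper names introduced only for this illustration.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def welch_ci(a, b, alpha):
    """(1 - alpha) Welch t interval for mean(a) - mean(b) (assumed form)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    half_width = stats.t.ppf(1 - alpha / 2, df) * np.sqrt(va + vb)
    d = a.mean() - b.mean()
    return d - half_width, d + half_width


def pairwise_cis(groups, alpha=0.05, bonferroni=True):
    """Intervals for every pairwise mean difference.

    groups: dict mapping a method label to its sample.
    With bonferroni=True each interval uses alpha / (number of pairs),
    so the intervals jointly cover with probability at least 1 - alpha.
    """
    pairs = list(combinations(groups, 2))
    per_interval_alpha = alpha / len(pairs) if bonferroni else alpha
    return {
        (g, h): welch_ci(groups[g], groups[h], per_interval_alpha)
        for g, h in pairs
    }
```

For three methods there are three pairs, so with $\alpha = 0.05$ each interval is computed at level $1 - 0.05/3$. The adjusted intervals are wider than the unadjusted ones, which makes it harder for each of them to fall inside $[LB, UB]$.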
The first thing to note is that the confidence intervals for the differences $D_{AB}, D_{AC}, D_{BC}$ cannot be computed using ANOVA, since these are dependent data and ANOVA assumes independent data.
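The dependence can be seen with a short derivation. As an illustration only, assume the observations are paired (so that the elementwise differences are defined) and that the measurements $A$, $B$, $C$ are mutually independent with finite variance. First, the three differences are linearly related:

$$D_{AC} - D_{AB} = (Data_A - Data_C) - (Data_A - Data_B) = Data_B - Data_C = D_{BC}.$$

Second, any two of them share a common term and are therefore correlated:

$$\mathrm{Cov}(D_{AB}, D_{AC}) = \mathrm{Var}(A) - \mathrm{Cov}(A, C) - \mathrm{Cov}(B, A) + \mathrm{Cov}(B, C) = \mathrm{Var}(A),$$

which is strictly positive unless $A$ is constant. So even when the original samples are independent, the three difference series are not.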