Unequal Sample Sizes Heteroskedasticity Two Factors Using R

Question

Asked by Bumbble Comm on 03 Apr 2026 at 9:42 · 26 views

I have heteroskedastic data with unequal sample sizes and would like to run a two-way Welch ANOVA.

1.) Is this appropriate? Why or why not?

2.) How do you do this in R?

3.) What are other ways of dealing with this situation?

**Bumbble Comm** · Accepted Answer

(1) If group variances differ, it is probably better to use the Welch ANOVA than the standard ANOVA. For a two-sample t test, many simulation studies make it clear that the Welch test is better than the 'pooled' test. For an ANOVA, even with only three treatment groups, there are many more scenarios to simulate. In my view, not enough of these studies have been done yet to be sure whether the Welch ANOVA should be the default method.

(2) See brief demo below. More extensive demonstrations on various Internet sites show more detail, including diagnostics and multiple-comparison procedures.

(3) Depending on the nature of the data, variances might be made more nearly equal by transforming the data. Two examples: if data are exponential, taking logs tends to make variances more nearly equal; if data are Poisson, taking square roots of the counts does the same. However, multiple comparisons are then not straightforward, and it is not clear that the transformation gives better power.
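To make the two transformations in (3) concrete, here is a minimal sketch on simulated data. The group sizes, rates, and means are arbitrary choices for illustration, not values from the question. It compares the ratio of the largest to the smallest group variance before and after transforming.

```r
set.seed(2024)  # arbitrary seed, only so the run is reproducible

g <- factor(rep(c("a", "b", "c"), each = 200))

# Exponential data: the SD equals the mean, so variances differ widely
e <- c(rexp(200, rate = 1), rexp(200, rate = 1/5), rexp(200, rate = 1/25))
var_raw <- tapply(e, g, var)
var_log <- tapply(log(e), g, var)

# Poisson counts: the variance equals the mean
k <- c(rpois(200, 5), rpois(200, 20), rpois(200, 80))
var_cnt  <- tapply(k, g, var)
var_sqrt <- tapply(sqrt(k), g, var)

# Ratio of largest to smallest group variance, before and after transforming
ratio <- function(v) max(v) / min(v)
round(c(exp_raw   = ratio(var_raw), exp_log   = ratio(var_log),
        pois_raw  = ratio(var_cnt), pois_sqrt = ratio(var_sqrt)), 2)
```

A ratio near 1 after transforming suggests the variances are roughly stabilized. It says nothing about whether a test on the transformed scale has better power.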

The following illustrates the Welch test in R for a one-factor ANOVA design with heteroscedastic data. For the Welch test, notice that the denominator df for the F-test is about 18, not 27. For the particular simulated data used, there is little difference in the P-value.

# Simulated data: 3 groups, 10 replications per group
set.seed(1214) # use same seed for same data
x1 = rnorm(10, 100, 15);  x2 = rnorm(10, 105, 20);  x3 = rnorm(10, 110, 15)
x = c(x1, x2, x3);  gp = as.factor(rep(1:3, each=10))

# Welch ANOVA
oneway.test(x ~ gp)

    One-way analysis of means (not assuming equal variances)

data:  x and gp
F = 3.8698, num df = 2.00, denom df = 17.91, p-value = 0.0401


# standard ANOVA
summary(aov(x ~ gp))
            Df Sum Sq Mean Sq F value Pr(>F)  
gp           2   2129  1064.7    3.63 0.0402 *
Residuals   27   7919   293.3                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
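The demo above has one factor, while the question involves two factors and unequal sample sizes. `oneway.test` accepts only a single grouping factor. One possible workaround in base R, offered as a sketch and not as a true two-way Welch ANOVA, is to treat each combination of the two factors as its own group. This tests whether all cell means are equal; it does not separate main effects from interaction. The factor names `A` and `B`, the cell sizes, the means, and the SDs are all made up for illustration.

```r
set.seed(101)  # arbitrary seed, only so the run is reproducible

# Hypothetical unbalanced 2 x 3 design with unequal cell SDs (made-up values)
cells <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2", "b3"))
cells$n  <- c(8, 12, 15, 6, 10, 20)
cells$mu <- c(100, 105, 100, 110, 105, 120)
cells$sd <- c(5, 10, 15, 5, 20, 10)

# One row per observation, then simulate the response
d <- cells[rep(seq_len(nrow(cells)), cells$n), c("A", "B", "mu", "sd")]
d$y <- rnorm(nrow(d), mean = d$mu, sd = d$sd)
d$cell <- interaction(d$A, d$B)  # six-level factor, one level per cell

# Cell sizes and sample SDs, to see the imbalance and heteroscedasticity
table(d$A, d$B)
with(d, tapply(y, list(A, B), sd))

# Welch test across the six cells (var.equal = FALSE is the default)
w <- oneway.test(y ~ cell, data = d)
w

# Follow-up: pairwise Welch t tests (no pooled SD), Holm-adjusted
pw <- pairwise.t.test(d$y, d$cell, pool.sd = FALSE,
                      p.adjust.method = "holm")
pw
```

If the cell-level test rejects, the pairwise comparisons indicate which cells differ, but questions about main effects or interaction need a model that represents the two factors separately.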
