Adjusting regression for small sample bias

I have a set of data points $\{x_i\}$. These data points are grouped so that (say) $i\in\{1,2,3\}$ is group $A$, $i\in\{4,5,6,7\}$ is group $B$, etc.

I would like to test the null hypothesis of no linear relationship between group means and group variances.

A naive approach would be to calculate, for each group, the sample mean $\overline{x}_g$ and sample variance $S^2_g$, then regress $$ \overline{x}_g = \beta S^2_g + \varepsilon_g $$ This doesn't work in small samples: $\overline{x}_g$ and $S^2_g$ are computed from the same few observations, so their sampling errors are correlated and $\hat{\beta}$ is a biased estimate of $\beta$. For example, if $\beta=0$ but the $\{x_i\}$ are strongly positively skewed, a spuriously high $\hat{\beta}$ will be estimated. This is because

  • Groups that do not contain positive outliers will have low $S^2_g$ and low $\overline{x}_g$.
  • Groups that do contain these outliers will have high $S^2_g$ and high $\overline{x}_g$.
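
To make the effect concrete, here is a minimal simulation sketch. For i.i.d. observations, $\operatorname{Cov}(\overline{x}, S^2) = \mu_3 / n$, where $\mu_3$ is the third central moment and $n$ the group size, so skewed data should produce a positive slope even when every group comes from the same distribution. The choices below are assumptions for illustration only and are not part of my actual data: an Exponential(1) distribution ($\mu_3 = 2$), groups of size 4, a normal comparison case, and a fitted intercept (which the regression above omits).

```python
import numpy as np

def mean_variance_slope(samples):
    """Return (OLS slope, covariance) of group means against group variances.

    samples: 2-D array with one row per group.
    Note: an intercept is fitted here, unlike the model in the question.
    """
    means = samples.mean(axis=1)
    variances = samples.var(axis=1, ddof=1)  # unbiased sample variance S^2_g
    slope, _intercept = np.polyfit(variances, means, deg=1)
    covariance = np.cov(means, variances)[0, 1]
    return slope, covariance

rng = np.random.default_rng(0)
n_groups, group_size = 200_000, 4  # many small groups, all from ONE distribution

# Positively skewed case: Exponential(1), third central moment = 2.
skewed = rng.exponential(scale=1.0, size=(n_groups, group_size))
# Symmetric comparison case: Normal(1, 1), third central moment = 0.
symmetric = rng.normal(loc=1.0, scale=1.0, size=(n_groups, group_size))

slope_skewed, cov_skewed = mean_variance_slope(skewed)
slope_symmetric, cov_symmetric = mean_variance_slope(symmetric)

print("skewed:    slope =", slope_skewed, " cov =", cov_skewed,
      " (theory: mu_3 / n =", 2 / group_size, ")")
print("symmetric: slope =", slope_symmetric, " cov =", cov_symmetric)
```

In both cases the true group means and variances are identical across groups, so any nonzero slope in the skewed case comes purely from the sampling correlation between $\overline{x}_g$ and $S^2_g$.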

Is there a general technique for solving my problem? It seems possible to fix the bias by bootstrapping, but the resulting estimator seems far from efficient.
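
For reference, this is roughly the bootstrap correction I have in mind, as a sketch only. It resamples within each group, estimates the bias as the mean bootstrap slope minus the naive slope, and subtracts that from the naive slope. The function names, the fitted intercept, the number of replicates, and the synthetic exponential data are all assumptions for illustration; I have not established that this correction is adequate for groups this small.

```python
import numpy as np

def fit_slope(groups):
    """OLS slope (with intercept) of group means on group sample variances."""
    means = np.array([g.mean() for g in groups])
    variances = np.array([g.var(ddof=1) for g in groups])
    slope, _intercept = np.polyfit(variances, means, deg=1)
    return slope

def bias_corrected_slope(groups, n_boot=500, seed=0):
    """Return (naive slope, bootstrap bias-corrected slope).

    Each replicate resamples observations with replacement WITHIN each group,
    so group membership and group sizes are preserved.
    """
    rng = np.random.default_rng(seed)
    naive = fit_slope(groups)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        resampled = [rng.choice(g, size=g.size, replace=True) for g in groups]
        boot[b] = fit_slope(resampled)
    estimated_bias = boot.mean() - naive
    return naive, naive - estimated_bias  # equals 2 * naive - boot.mean()

# Hypothetical data: 60 groups of 3 to 7 skewed observations each,
# all drawn from the same distribution (so the true slope is zero).
data_rng = np.random.default_rng(1)
groups = [data_rng.exponential(size=data_rng.integers(3, 8)) for _ in range(60)]

naive, corrected = bias_corrected_slope(groups)
print("naive slope:", naive)
print("bias-corrected slope:", corrected)
```

The extra resampling noise is what makes me doubt the efficiency of the resulting estimator.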