Calculating the error term for $Y = \beta X + \epsilon$, at fixed power


Say I have a simulated linear model $Y = \beta X + \epsilon$. I want to calculate the error variance $\sigma^{2}_{\epsilon}$ of the model for a fixed power (say 80%), which means we also fix $N$, the significance level $\alpha$, and the number of active $X$s, $u$.

It seems that the calculation should proceed through the 'effect size' term $f^2 = \frac{R^2}{1-R^2}$ used in power calculations for linear models, and hence through $R^2$, which depends on the variance already in $Y$ (before error is added to the model) and on the residuals after the error is added.

However, after much algebra, I'm not sure I'm getting the correct solution. Letting $\sigma^{2}_{Y_b}$ represent the variance in $Y$ BEFORE adding $\epsilon$ (and using an R package to calculate $f^2$ from the other fixed parameters), I get:

$$ \sigma^{2}_{\epsilon} = \frac{\sigma^{2}_{Y_b}}{1 - f^2} $$

Can anyone second or correct this finding, with clean algebra?

Best Answer

Doing the algebra a bit more slowly, I believe I found the correct solution:

  1. For a linear model with a single outcome $y = \beta X + \epsilon$, we know that $R^{2}$ (the 'proportion of variance explained' by the model) is calculated from the total and residual sums of squares: $$ R^{2} = 1 - \frac{SS_{res}}{SS_{tot}} $$ and that a linear model's 'effect size' $f^{2}$ is $$ f^{2} = \frac{R^{2}}{1-R^{2}} $$ Thus, we can see $$ f^{2} = \frac{SS_{tot}}{SS_{res}} - 1 $$
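As a quick numeric sanity check (with made-up sums of squares, not values from the question), the two expressions for $f^{2}$ agree:

```python
# Illustrative sums of squares (assumed values)
ss_tot = 250.0  # total sum of squares
ss_res = 50.0   # residual sum of squares

r2 = 1 - ss_res / ss_tot          # R^2 = 1 - SS_res/SS_tot
f2_from_r2 = r2 / (1 - r2)        # f^2 = R^2 / (1 - R^2)
f2_from_ss = ss_tot / ss_res - 1  # f^2 = SS_tot/SS_res - 1

print(f2_from_r2, f2_from_ss)  # both 4.0
```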

  2. We also know that the total sum of squares ($SS_{tot}$) is proportional (with factor $n-1$) to the variance of $y$ ($\sigma^{2}_{y}$), since $$ SS_{tot} = \sum_{i} (y_{i} - \overline y)^{2} = (n-1) \cdot \sigma^{2}_{y}$$ and further that $\sigma^{2}_{y}$ (for $y = \beta X + \epsilon$) can be expressed as the sum of the variances arising separately from $\beta \cdot X$ and $\epsilon$ (because the covariance between them is zero), that is, as a sum of the variances in $y$ 'before' and 'after' error is added to the model: $$\sigma^{2}_{y} = \sigma^{2}_{\beta X} + \sigma^{2}_{\epsilon}$$ hence $$ SS_{tot} = (n-1) (\sigma^{2}_{\beta X} + \sigma^{2}_{\epsilon})$$
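A short simulation (with an assumed $\beta$ and error SD, chosen only for illustration) shows the zero-covariance decomposition $\sigma^{2}_{y} = \sigma^{2}_{\beta X} + \sigma^{2}_{\epsilon}$ holding to within sampling error:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta = 2.0
sigma_eps = 1.5  # assumed error SD, so sigma^2_eps = 2.25

x = rng.normal(size=n)
eps = rng.normal(scale=sigma_eps, size=n)
y = beta * x + eps

# Var(y) should approximately equal Var(beta*x) + Var(eps),
# since x and eps are generated independently.
var_y = np.var(y, ddof=1)
var_sum = np.var(beta * x, ddof=1) + np.var(eps, ddof=1)
print(var_y, var_sum)  # both close to 4 + 2.25 = 6.25
```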

  3. Lastly, we know that the sum of squared residuals is $$SS_{res} = \sum_{i} \epsilon_{i}^{2} $$ which is similarly proportional (with factor $n-1$) to $\sigma^{2}_{\epsilon}$, the variance of the error term, i.e. $$SS_{res} = (n-1) \cdot \sigma^{2}_{\epsilon} $$

  4. Putting all of these together, we get \begin{align*} f^{2} &= \frac{(n-1)(\sigma^{2}_{\beta X} + \sigma^{2}_{\epsilon})}{(n-1) \cdot \sigma^{2}_{\epsilon}} - 1 = \frac{\sigma^{2}_{\beta X}}{\sigma^{2}_{\epsilon}} \end{align*} and thus $$ \sigma^{2}_{\epsilon} = \frac{\sigma^{2}_{\beta X}}{f^{2}} \; \; \; \blacksquare $$
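The final result can be sanity-checked by simulation: pick a target $f^{2}$, set $\sigma^{2}_{\epsilon} = \sigma^{2}_{\beta X} / f^{2}$, and confirm that an OLS fit recovers roughly the target effect size. A sketch in Python (the $\beta$, $n$, and target $f^{2}$ are assumed values, not taken from the question):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
beta = 0.5
f2_target = 0.15  # assumed target effect size

x = rng.normal(size=n)
var_bx = np.var(beta * x, ddof=1)   # sigma^2_{beta X}
sigma2_eps = var_bx / f2_target     # derived error variance
y = beta * x + rng.normal(scale=np.sqrt(sigma2_eps), size=n)

# Fit simple OLS and recover f^2 = R^2 / (1 - R^2)
bhat, ahat = np.polyfit(x, y, 1)
resid = y - (bhat * x + ahat)
ss_res = np.sum(resid ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
f2_hat = r2 / (1 - r2)
print(f2_hat)  # close to the target 0.15
```

With a large $n$ the empirical $f^{2}$ lands very near the target, which is consistent with $\sigma^{2}_{\epsilon} = \sigma^{2}_{\beta X} / f^{2}$ being the right inversion.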