Something I don't understand in regression models:
Following the definition of $R^2$, $$R^2=1-\frac{SS_{error}}{SS_{total}}$$ the adjusted R-squared is defined as: $$\bar{R}^2=1-\frac{SS_{error}/df_{error}}{SS_{total}/df_{total}}$$ Now, since we have the equivalent definition of $R^2$, $$R^2=\frac{SS_{regression}}{SS_{total}}$$ why don't we use an adjusted R-squared defined as: $$\bar{R}^2=\frac{SS_{regression}/df_{regression}}{SS_{total}/df_{total}}$$ A subsidiary question: is there really an equivalence between the two expressions of $R^2$? Or should one be preferred to the other? (I haven't found any paper showing who first introduced $R^2$, and in which form.)
Your formula wouldn’t work, because (using your terminology) you’d have:
$\overline{R}^2 = \frac{SS_{regression}/df_{regression}}{SS_{total}/df_{total}} = \frac{SS_{regression}}{SS_{total}} \cdot \frac{df_{total}}{df_{regression}} = \left(1- \frac{SS_{error}}{SS_{total}} \right) \frac{df_{total}}{df_{regression}}$
For a simple linear regression, $df_{regression} = 1$ and $df_{total} = n - 1$, so with many data points the factor $df_{total}/df_{regression} = n - 1$ would make this potentially much larger than 1.
For your subsidiary question, I’m assuming that $SS_{regression}$ refers to what I’ve heard called the explained sum of squares (ESS), and $SS_{error}$ refers to the residual sum of squares (RSS), in which case the equivalence of the two formulas comes from the identity TSS = ESS + RSS, which holds for a linear regression fit by least squares with an intercept term.
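Both points are easy to check numerically. Here is a quick sketch in Python with NumPy, using made-up deterministic data (the data and variable names are just for illustration): it verifies TSS = ESS + RSS for an OLS fit with intercept, and shows the proposed "adjusted" formula blowing up far past 1.

```python
import numpy as np

# Made-up data for a simple linear regression (deterministic "noise")
x = np.arange(50.0)
y = 2.0 * x + 3.0 + np.sin(x)

# Fit y = slope * x + intercept by least squares
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

rss = np.sum((y - y_hat) ** 2)         # SS_error
ess = np.sum((y_hat - y.mean()) ** 2)  # SS_regression
tss = np.sum((y - y.mean()) ** 2)      # SS_total

# The two R^2 definitions agree because TSS = ESS + RSS holds here
assert np.isclose(tss, ess + rss)
assert np.isclose(ess / tss, 1 - rss / tss)

# Proposed "adjusted" formula: df_regression = 1, df_total = n - 1
n = len(x)
proposed = (ess / 1) / (tss / (n - 1))
print(proposed)  # roughly (n - 1) * R^2, far greater than 1
```

With 50 points the proposed quantity is close to $49 \cdot R^2$, which makes it useless as a goodness-of-fit measure bounded by 1.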
There are other formulas for adjusted R-squared; see for example:
https://stats.stackexchange.com/questions/48703/what-is-the-adjusted-r-squared-formula-in-lm-in-r-and-how-should-it-be-interpret