Value Analogous to R² Value when forcing line-of-best-fit through origin


A group of us college students is tackling the issue that the R² value is not meaningful for a linear regression forced through the origin. We are trying to come up with a comparable metric for the specific case of a linear regression constrained to have a y-intercept of 0.

We came up with the idea of measuring the angle between the best-fit line and the line forced through the origin, then comparing that to the R² of the best-fit line to get a value that measures the quality of the second line.
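The angle-based idea could be sketched roughly as follows. This is only an illustration of the proposal as we understand it, with made-up data and variable names of our own choosing:

```python
import math
import numpy as np

# Illustrative data (hypothetical, not from the original post)
rng = np.random.default_rng(1)
x = np.linspace(1, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=x.size)

# Slope and intercept of the ordinary best-fit line
b1, b0 = np.polyfit(x, y, 1)

# Slope of the least-squares line forced through the origin:
# minimizing sum (y_i - b*x_i)^2 gives b = sum(x*y) / sum(x^2)
b = np.sum(x * y) / np.sum(x * x)

# Angle (in degrees) between the two fitted lines
angle = math.degrees(abs(math.atan(b1) - math.atan(b)))
```

One caveat worth noting: the angle between two lines is not scale-invariant (rescaling x or y changes it), which is something the proposal would need to address.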

Many of us are not math majors, so before submitting our idea anywhere, we figured we’d ask some people who know what they are doing to criticize our idea. Any feedback is appreciated.

Thanks.


This is too long for a comment, hence provided as an answer.

Before considering an analogous metric, it is good to keep in perspective how $R^2$ is defined. In my opinion it is best viewed through the variance-components breakdown. Consider the response variable $y$ and suppose we had no additional information (i.e., no predictors available). Then the best (in "some sense") predicted value of $y$ is $\bar{y}$. Now, if we bring in the predictor variable and build the model, we want to know what we have gained. $R^2$ looks at the ratio of how the predicted $\hat{y}_i$'s are spread about $\bar{y}$ to how the observed $y_i$'s are spread about $\bar{y}$ (the ratio of the "regression sum of squares" to the "total sum of squares").

That is,

$$R^2 = \dfrac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2} $$

Now think about how this ratio should (could) be adjusted if the y-intercept is forced to be 0.
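One common adjustment, which I'll sketch here as a hint rather than a prescription, is to measure both spreads about 0 instead of about $\bar{y}$ when the intercept is forced to 0 (this is the "uncentered" $R^2$ that some statistics packages report for no-intercept models). The data below are made up for illustration:

```python
import numpy as np

# Illustrative data (hypothetical)
rng = np.random.default_rng(0)
x = np.linspace(1, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=x.size)

# --- Ordinary least squares with an intercept ---
X = np.column_stack([np.ones_like(x), x])
b0, b1 = np.linalg.lstsq(X, y, rcond=None)[0]
yhat = b0 + b1 * x
# Conventional R^2: spreads measured about y-bar
r2 = np.sum((yhat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)

# --- Regression forced through the origin ---
b = np.sum(x * y) / np.sum(x * x)
yhat0 = b * x
# "Uncentered" R^2: spreads measured about 0 rather than y-bar
r2_uncentered = np.sum(yhat0 ** 2) / np.sum(y ** 2)
```

Note that the two numbers are not directly comparable: the uncentered version answers "how much of the spread about 0 does the line explain?", which is a different question, and it tends to come out close to 1 whenever the response values sit far from 0.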