Effect of an additional regressor on $R^2$


I'm currently studying the simple linear regression model, and the book I'm working through says that any time you add a regressor to your model, even an irrelevant one, the coefficient of determination

$$R^2 = \frac{\text{ESS}}{\text{TSS}}$$

necessarily increases. Why is that the case?

Thanks in advance, peace.

Best answer:

Another way to write $R^2$ is $$R^2 = 1 - \frac{\text{RSS}}{\text{TSS}}$$ where $$\text{RSS} = \sum_{i=1}^n (y_i - \hat{\beta}^\top x_i)^2 = \sum_{i=1}^n (y_i - \sum_{j=1}^d \hat{\beta}_j (x_i)_j)^2$$ is the residual sum of squares, which is the sum of the squares of the errors of your least squares fit $\hat{\beta}$.
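As a quick numerical illustration of this formula, here is a minimal numpy sketch (the dataset, coefficients, and noise level are made up for the example) that fits $\hat{\beta}$ by least squares and computes $R^2 = 1 - \text{RSS}/\text{TSS}$:

```python
import numpy as np

# Hypothetical dataset: n = 50 observations, d = 2 regressors.
rng = np.random.default_rng(0)
n, d = 50, 2
X = rng.normal(size=(n, d))
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)

# Least-squares fit: beta_hat minimizes the residual sum of squares.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

rss = np.sum((y - X @ beta_hat) ** 2)  # residual sum of squares
tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
r2 = 1 - rss / tss
print(f"R^2 = {r2:.4f}")
```

(Note that $\text{ESS}/\text{TSS}$ and $1 - \text{RSS}/\text{TSS}$ coincide when the model includes an intercept; the sketch above just computes the latter directly.)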

So if we show that adding a regressor necessarily decreases the RSS (or leaves it unchanged), then $R^2$ necessarily increases (or stays the same).

The idea is that $\hat{\beta}$ is defined to minimize the RSS. That is, $$\text{RSS} = \min_\beta \sum_{i=1}^n \Big(y_i - \sum_{j=1}^d \beta_j (x_i)_j\Big)^2.$$ If you add another regressor, you instead solve $$\min_\beta \sum_{i=1}^n \left[y_i - \Big(\beta_{d+1} (x_i)_{d+1} + \sum_{j=1}^d \beta_j (x_i)_j\Big)\right]^2.$$ This value is no larger than the previous one: imposing the restriction $\beta_{d+1} = 0$ recovers the original problem, so letting $\beta_{d+1}$ vary freely can only produce a smaller (or equal) minimum.
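The argument above can be checked numerically. The following numpy sketch (the data-generating process is made up for illustration) fits the same response once with the original regressors and once with an extra column of pure noise appended, and compares the two RSS values:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = rng.normal(size=(n, 3))  # three genuine regressors
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=n)

def rss(X, y):
    """Residual sum of squares of the least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

# An "irrelevant" regressor: noise unrelated to y.
junk = rng.normal(size=(n, 1))
X_plus = np.hstack([X, junk])

rss_small = rss(X, y)
rss_big = rss(X_plus, y)
# rss_big <= rss_small: the larger model can always mimic the
# smaller one by setting the extra coefficient to zero.
print(rss_small, rss_big)
```

Since TSS does not depend on the regressors, the smaller (or equal) RSS of the larger model translates directly into a larger (or equal) $R^2$.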