Weird definition of the coefficient of determination


In my regression course I encountered the definition of the coefficient of determination below. It is for a linear regression $y = X\beta$, with $\hat{y}$ the projection of $y$ onto $\operatorname{span}(X)$. Supposedly $$R^2=\frac{\| \hat{y}-\overline{y}1_n\|^2}{\|y-\overline{y}1_n\|^2}.$$ I don't understand this definition, since this is supposed to be the squared empirical correlation coefficient, and yet that doesn't seem to be the case at all here. Can someone explain?


There is 1 best solution below


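It is the squared correlation coefficient between $\hat y$ and $y$, i.e. $R^2 = r^2_{\hat y, y}$. This holds whenever the model includes an intercept, so that $1_n \in \operatorname{span}(X)$. In the case of a simple linear model, i.e., $y_i = \beta_0 + \beta_1 x_i + \epsilon_i$, $$ r^2_{\hat y, y} = r^2_{x, y}, $$ because $\hat y_i = \hat \beta_0 + \hat \beta_1 x_i$ is an affine function of the only regressor $x$ you have, and a correlation coefficient is unchanged, up to sign, by an affine transformation of one of its arguments. For multiple regression there is no single $x$ to correlate with $y$, so $R^2$ is just $r^2_{\hat y, y}$.

For the simple model, you can work it out directly by replacing $\hat y_i$ with $\hat \beta_0 + \hat \beta_1 x_i$ in your equation. Since $\hat \beta_0 = \bar y - \hat \beta_1 \bar x$, you have $\hat y_i - \bar y = \hat \beta_1 (x_i - \bar x)$, and by simple algebra you'll arrive at $$ R^2 = \frac{ \hat \beta_1^2 S_x^2}{S_y^2}, $$ where $S_x^2$, $S_y^2$ and $S_{x,y}$ denote the sample variances of $x$ and $y$ and their sample covariance (any common normalization cancels in the ratio). Then, replacing $\hat \beta_1$ with $S_{x,y}/S_x^2$, you'll get $$ R^2 = \frac{S_{x,y}^2}{S_x^2 S_y^2} = r^2_{x, y}, $$ that is, $r^2_{\hat y, y} = r^2_{x, y}$.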
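If you want to convince yourself numerically, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the simulated data, the coefficients, and the helper name `r_squared` are not from your course. It computes $R^2$ from the projection definition in the question and compares it with the squared correlations, for one regressor and for two. Both designs include a column of ones, i.e. an intercept.

```python
import numpy as np

# Simulated data; sample size and coefficients are arbitrary illustrative choices.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)


def r_squared(X, y):
    """Return (R^2, y_hat) with R^2 = ||y_hat - ybar 1||^2 / ||y - ybar 1||^2,
    where y_hat is the least-squares projection of y onto span(X)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta_hat
    y_bar = y.mean()
    r2 = np.sum((y_hat - y_bar) ** 2) / np.sum((y - y_bar) ** 2)
    return r2, y_hat


ones = np.ones(n)

# Simple regression: intercept plus one regressor.
X_simple = np.column_stack([ones, x1])
r2_simple, y_hat_simple = r_squared(X_simple, y)
corr_xy = np.corrcoef(x1, y)[0, 1]
corr_yhat_y = np.corrcoef(y_hat_simple, y)[0, 1]

# Multiple regression: intercept plus two regressors.
X_multi = np.column_stack([ones, x1, x2])
r2_multi, y_hat_multi = r_squared(X_multi, y)
corr_multi = np.corrcoef(y_hat_multi, y)[0, 1]

print("simple:  ", r2_simple, corr_xy ** 2, corr_yhat_y ** 2)
print("multiple:", r2_multi, corr_multi ** 2)
```

In the simple case all three printed numbers should agree up to floating-point error. In the multiple case $R^2$ should match only $r^2_{\hat y, y}$. If you drop the column of ones, the identities generally fail, which is why the intercept matters.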