Why is the slope of the regression line always smaller than the slope of the SD line?


I'm learning linear regression, and according to my learning materials the slope of the regression line equals $r\frac{\sigma_y}{\sigma_x}$, where $-1\leqslant r \leqslant 1$ is the correlation coefficient and $\frac{\sigma_y}{\sigma_x}$ is the slope of the SD line. Since $|r|\leqslant 1$, an increase of $x$ by one $\sigma_x$ can cause at most a one-$\sigma_y$ increase in the predicted $y$. But why is that so? Why can't the relationship between two variables be arranged so that increasing $x$ by one $\sigma_x$ causes a $2\sigma_y$ increase in $y$? Thank you in advance for your help.
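One can check the claimed identity numerically. Here is a small sketch using NumPy (simulated data and the seed are arbitrary choices, not from the learning materials):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = 2.0 * x + rng.normal(size=10_000)  # underlying slope 2, but noise lowers r

slope = np.polyfit(x, y, 1)[0]         # least-squares (regression) slope
r = np.corrcoef(x, y)[0, 1]
sd_ratio = y.std() / x.std()           # slope of the SD line

# The fitted slope coincides with r * (sigma_y / sigma_x); since |r| <= 1,
# it can never exceed the SD-line slope sigma_y / sigma_x.
print(slope, r * sd_ratio, sd_ratio)
```

Running this shows the fitted slope agreeing with $r\frac{\sigma_y}{\sigma_x}$ to floating-point precision, and sitting below $\frac{\sigma_y}{\sigma_x}$.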


1 Answer


The variance of $y$ can be decomposed into an explained part, $\sigma^2_{Y,e}=\dfrac{\text{cov}^2_{XY}}{\sigma^2_X}=\rho^2_{XY}\,\sigma^2_Y$, and a residual, unexplained part, $\sigma^2_{Y,u}=(1-\rho^2_{XY})\,\sigma^2_Y$.
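This decomposition can be verified on simulated data; the following sketch (coefficients and seed are arbitrary for illustration) fits an ordinary least-squares line and compares the residual variance with $(1-\rho^2_{XY})\,\sigma^2_Y$:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)
y = 0.8 * x + rng.normal(scale=0.5, size=10_000)

rho = np.corrcoef(x, y)[0, 1]
slope, intercept = np.polyfit(x, y, 1)   # OLS fit
residuals = y - (slope * x + intercept)

var_y = y.var()
# Explained part: rho^2 * var_y; unexplained part: (1 - rho^2) * var_y.
# The residual variance of the OLS fit equals the unexplained part exactly.
print(residuals.var(), (1 - rho**2) * var_y)
```

The two printed numbers agree to floating-point precision, confirming the identity.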

The regression line appropriately uses only the explained part, which reflects the linear model and minimizes the residual error. Because the explained standard deviation is $|\rho_{XY}|\,\sigma_Y$, the regression slope has magnitude $|\rho_{XY}|\dfrac{\sigma_Y}{\sigma_X}$, which can never exceed the SD-line slope $\dfrac{\sigma_Y}{\sigma_X}$.

The SD line (which does not appear to be a widely used concept, though it shows up in some introductory texts) effectively absorbs the unexplained variance into its slope as well, and for this reason has little use as a predictor. Other lines with "forced" steeper slopes would fit even worse.