The regression line, which passes through the point of averages with a slope of r times (SD of y)/(SD of x), is said to give a good estimate of the average value of y for each value of x.
I can see why this is the case when r = 1, 0, or -1. When r = 1, all the points lie on a line, so each SD of increase in x goes with one SD of increase in y; likewise for r = -1, except the relationship is inverse. For r = 0, there is no correlation, so on average an increase in x has no effect on y.
But what about the values in between? I am using Freedman's Statistics textbook, and it mentions that while r is the correct factor to use for values between -1 and 1, a "complicated mathematical argument is needed". What is this argument?
Pearson's correlation coefficient between $X$ and $Y$ is given by $$ \rho = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y}. $$ In the simple linear regression model $y=\beta_0 + \beta_1x + \epsilon$, the slope $\beta_1$ has the form $$ \beta_1 = \frac{\operatorname{cov}(X,Y)}{\sigma^2_X}. $$ Multiplying numerator and denominator by $\sigma_Y$ gives $$ \beta_1 = \frac{\operatorname{cov}(X,Y)\,\sigma_Y}{\sigma_X \sigma_X \sigma_Y}=\rho\,\frac{\sigma_Y}{\sigma_X}. $$ So the rate of change of $y$ as a function of $x$ is the correlation scaled by the ratio of the standard deviations: $$ \frac{\partial y}{\partial x}=\beta_1 = \rho\,\frac{\sigma_Y}{\sigma_X}. $$
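The identity $\beta_1 = \rho\,\sigma_Y/\sigma_X$ is easy to check numerically. Below is a minimal sketch using simulated data (the data-generating slope of 0.5 and the sample size are arbitrary choices for illustration): the least-squares slope agrees with $r$ times the ratio of sample standard deviations, up to floating-point error.

```python
import numpy as np

# Hypothetical data: y depends linearly on x plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)

# Sample correlation coefficient r.
r = np.corrcoef(x, y)[0, 1]

# Least-squares slope from a degree-1 fit (equals cov(x, y) / var(x)).
slope = np.polyfit(x, y, 1)[0]

# The identity beta_1 = r * (sd_y / sd_x); it holds for any consistent
# choice of ddof, since the normalization cancels in the ratio.
assert np.isclose(slope, r * y.std() / x.std())
```

Note that the identity holds exactly for the sample quantities, not just in expectation, because the least-squares slope is by construction the sample covariance divided by the sample variance of x.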