Regression function - conditional mean


I am trying to understand the statistical fundamentals behind linear regression, and I have never been able to intuitively understand the following. I would really appreciate an intuitive explanation:

The regression function is the conditional mean of $Y$ given $X$, i.e. $E[Y|X]$.

Instinctively, I am trying to average the $y$-values over a set of $x$-values, but I fail to see how this leads to a linear function, or how it relates to the way you would normally compute a regular unconditional average...

Help much appreciated!


There is 1 best solution below

On BEST ANSWER

Assume $X$ is height and $Y$ is weight. You take a sample of $n$ people and measure these two values, i.e. you have a sample $$(X_i,Y_i), \qquad i=1,2,\ldots,n$$ You calculate the regression line $$\mu_{Y|X}=b_0+b_1X$$ with $X$ as the predictor and $Y$ as the response variable. Now assume you are given a value for $X$, say $x=170$ cm, and you calculate the left-hand side, which is the conditional mean of $Y$ given $X=170$, i.e. $E[Y|X=170]$ or equivalently $\mu_{Y|X=170}$. Say its value is $70$ kg. It has the following interpretation: a person who is 170 cm tall weighs 70 kg on average. Equivalently, the mean weight of people who are 170 cm tall is 70 kg. You do not average over different values of $X$; rather, for each fixed $x$, the mean value of the response $Y$ lies on your regression line.
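To make the height and weight interpretation concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the data are synthetic, the "true" relationship (weight = -100 + 1 × height plus noise) is invented so that the mean weight at 170 cm is 70 kg, and heights are restricted to a few discrete values so that an average can be taken at each one. It averages the weights separately for each height and compares those conditional means with the fitted line.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, made-up population: heights (cm) take a few discrete values,
# and weight (kg) is linear in height plus random noise.
heights = rng.choice(np.arange(150, 200, 10), size=20_000)
weights = -100.0 + 1.0 * heights + rng.normal(0.0, 8.0, size=heights.size)

# Least-squares regression line: weight ~ b0 + b1 * height.
# np.polyfit returns coefficients from highest degree to lowest.
b1, b0 = np.polyfit(heights, weights, deg=1)

# Conditional means: average the weights separately for each height value.
levels = np.unique(heights)
cond_means = np.array([weights[heights == h].mean() for h in levels])

# Values of the fitted line at the same heights.
fitted = b0 + b1 * levels

for h, m, f in zip(levels, cond_means, fitted):
    print(f"height={h} cm  mean weight={m:.1f} kg  line={f:.1f} kg")
```

In this setup the per-height averages and the line agree up to sampling noise, because the data were generated with a linear conditional mean. With real data the agreement is only as good as the linearity assumption.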

Another example: let $X$ be the floor area of an apartment in square meters and $Y$ its rental price. For any given $x$ (say, for example, $x=100\,\text{m}^2$) the regression line gives the average price of all apartments that are $100\,\text{m}^2$ in size.
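As a short worked step connecting this to the ordinary (unconditional) average asked about in the question: assuming the conditional mean really is linear in $x$, averaging the conditional means over the distribution of $X$ recovers the overall mean of $Y$ (this is the law of total expectation): $$E[Y] = E\big[E[Y|X]\big] = E[b_0+b_1X] = b_0+b_1E[X]$$ So the unconditional average is the conditional averages weighted by how often each $x$ occurs, and the line passes through the point $(E[X],E[Y])$. The linearity itself is a modelling assumption, not a consequence of averaging.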