Linear regression and expectation value


So I'm struggling to understand something, perhaps very basic; I'm fairly new to linear regression. Imagine you have a set of data $\{(x_i, y_i)\}_{i=1}^n$ from $n$ measurements, and they of course follow a linear trend. Now, to find the equation of the line that fits this data, I use the least-squares method, starting from the model $y_i = b + m x_i + e_i$, where $e_i$ is the residual.
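As a concrete sketch of the least-squares step (my own variable names and simulated data, not from the question), the closed-form OLS formulas $m = \operatorname{cov}(x,y)/\operatorname{var}(x)$ and $b = \bar{y} - m\bar{x}$ can be checked directly:

```python
import numpy as np

# Simulated data with a known linear trend: true b = 2, true m = 3
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=x.size)

# Closed-form OLS estimates that minimize sum_i (y_i - b - m*x_i)^2
m_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b_hat = y.mean() - m_hat * x.mean()

print(b_hat, m_hat)  # should land near the true values 2 and 3
```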

Another approach I've seen is to maximize the probability of obtaining the data, which is proportional to a normal density, $P = C \exp\left(-\frac{1}{2}\sum_{i=1}^n \left(\frac{y_i - \mu_i}{\sigma_i}\right)^2\right)$, where later on $\mu_i$ becomes $b + m x_i$. This turns out to be equivalent, because maximizing the probability amounts to minimizing the argument of the exponential.
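The equivalence can be checked numerically: with equal $\sigma_i = \sigma$, the negative log-likelihood is an increasing affine function of the residual sum of squares, so both objectives pick the same $(b, m)$. A small sketch over a grid (simulated data, my own names):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 30)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=x.size)
sigma = 0.3

# Evaluate both objectives over a grid of candidate (b, m) pairs
bs = np.linspace(0.0, 2.0, 81)
ms = np.linspace(1.0, 3.0, 81)
B, M = np.meshgrid(bs, ms)
resid = y[:, None, None] - (B[None] + M[None] * x[:, None, None])
rss = np.sum(resid ** 2, axis=0)  # residual sum of squares
# negative log-likelihood = constant + rss / (2 sigma^2)
nll = 0.5 * x.size * np.log(2 * np.pi * sigma**2) + rss / (2 * sigma**2)

# Minimizing the NLL and minimizing the RSS select the same grid point
idx_rss = np.unravel_index(rss.argmin(), rss.shape)
idx_nll = np.unravel_index(nll.argmin(), nll.shape)
assert idx_rss == idx_nll
```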

So here's my question. How can I properly/formally show that $E[y_i] = b + m x_i$, thus losing the residual? Is this an assumption or can I demonstrate it? Thank you very much.

Best answer:

In OLS, there's an exogeneity assumption, which states that the conditional mean of the residual is $0$, i.e., $E[\varepsilon \ | \ x] = 0$. Then, you have:

$$ E[y \ | \ x] = E[b + mx + \varepsilon \ | \ x] $$
$$ = E[b + mx \ | \ x] + E[\varepsilon \ | \ x] $$
$$ = b + mx. $$

This is true regardless of the distribution of the residuals - you just need the conditional mean to be $0$. If you further assume that, conditional on $x$, the residuals are normally distributed, then the Maximum Likelihood Estimator (MLE) is equivalent to the OLS estimator.