OLS estimators in econometrics.

55 Views Asked by At

Please explain the rationale behind minimizing the sum of square of difference between the individual y (dependent variable) and the estimate of conditional mean of y. The estimator gives the estimate of conditional mean not individual y so why are we trying to make it close to the individual value. I am not asking for the explanation of why we take squared sum.

1

There are 1 best solutions below

0
On

" I am not asking for the explanation of why we take squared sum."

But this is exactly the point. BTW, I guess you meant "sum of squares" $\sum w_i ^2$, and not "squared sum" $(\sum w_i)^2$. Anyway, as you've mentioned, you are interested in the estimation of the conditional mean (expectation), i.e., $$ \mathbb{E}[Y|X]. $$
But, actually why? This is because $\mathbb{E}[Y|X]$ is the function $g(X)$ that minimizes the mean squared error, $$ \min\mathbb{E}((g(X) - Y |X)^2 . $$ Now you can assume that $\mathbb{E}[Y|X]$ is linear, or just take a linear approximation of it. Namely, you are interested in find the $a$ and $b$ that minimize $$ \min_{a,b}\mathbb{E}((aX+b - Y |X)^2 . $$

However, this is with respect to the populations' distribution. So, what would be the straight-forward sample analog? The empirical MSE, i.e.,

$$ \min_{a,b}\frac{1}{n}\sum_{i=}^n(ax_i+b - y_i)^2 , $$ that is, by minimizing the "individual" square differences you are basically estimating the coefficients of a linear conditional mean.