Why does the distribution of the error term in a regression model matter?


When we want to build a regression model $Y = f(X)+\epsilon$, why does the distribution of $\epsilon$ matter at all? Why not simply find the $f(X)$ that optimizes a performance measure (for example, minimizes the MSE)? What are the implications of assuming that $\epsilon$ follows a normal distribution, a Poisson distribution (when $Y$ is a count), or any other distribution?
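To make the question concrete, here is a minimal sketch (NumPy, with made-up data) of what ignoring the error distribution can look like: on count data with Poisson noise, the MSE-optimal straight line is a perfectly valid optimizer, yet its predictions are not constrained to be non-negative the way a Poisson-mean model $\exp(a + bx)$ would be.

```python
import numpy as np

# Toy illustration (made-up data): the response is a count whose mean
# grows exponentially with x, with Poisson noise around that mean.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 3.0, size=200)
y = rng.poisson(np.exp(0.5 + 0.8 * x))

# Ignore the error distribution and just minimize MSE with a linear f(X).
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Because the true mean curve is convex, the MSE-optimal line can dip
# below zero near x = 0, a prediction no count model should make;
# assuming Poisson errors (mean = exp(a + b*x)) rules that out by design.
print("MSE-optimal fit: intercept %.2f, slope %.2f" % (beta[0], beta[1]))
```

Both fits "optimize" something; the distributional assumption is what decides which objective, and which constraints on the predictions, make sense for the data.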