Need help understanding lack of fit for simple linear regression.

My book says that a lack of fit test measures whether your linear regression model is "appropriate" for your data, that is, whether it is a "good model" for your data. Does this mean that lack of fit measures whether your model accurately describes the relationship between the independent variable and the dependent variable? What does lack of fit mean, specifically in terms of simple linear regression? What do they mean by "appropriate" and "good model"? I'm having difficulty understanding it.

Best answer

Let's say you perform a regression and get some predicted values $\hat y(x)$, based on the model you assumed. The observed values at the same x values are $y(x)$. There will be some error between what your model says and the real world. This error is measured by a loss function applied to $\hat y(x)-y(x)$. Say you use a squared loss. Then the total error is $\sum\limits_x\left(\hat y(x)-y(x)\right)^2$.

Some of this loss cannot be removed by any model. For example, if the same x value has 2 different observed y values, then the best any model can do is predict their mean, and the spread of the observations around that mean remains. This part is called "pure error". The rest is model error, or "lack of fit": it comes from the fitted model (here, a straight line) missing the mean response at each x value.
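To make the split into "pure error" and "lack of fit" concrete, here is a minimal sketch in plain Python. The data and the helper names `fit_line` and `decompose_error` are made up for illustration; they are not from your book. The example has two observations at each x value, fits a straight line by least squares, and splits the total squared error into the two parts.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def decompose_error(xs, ys):
    """Return (total squared error, pure error, lack of fit) for a line fit."""
    intercept, slope = fit_line(xs, ys)

    # Group the observed y values by their x value (the replicates).
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)

    # Total squared error of the fitted line.
    total = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

    pure = 0.0
    lack_of_fit = 0.0
    for x, g in groups.items():
        mean_y = sum(g) / len(g)
        # Pure error: spread of the replicates around their own mean.
        pure += sum((y - mean_y) ** 2 for y in g)
        # Lack of fit: distance from the group mean to the fitted line,
        # counted once per observation in the group.
        lack_of_fit += len(g) * (mean_y - (intercept + slope * x)) ** 2

    return total, pure, lack_of_fit


# Made-up data: the trend is curved, with two observations at every x.
xs = [1, 1, 2, 2, 3, 3, 4, 4]
ys = [1.1, 0.9, 4.2, 3.8, 9.1, 8.9, 16.3, 15.7]

sse, pure, lof = decompose_error(xs, ys)
print("total squared error:", sse)
print("pure error:         ", pure)
print("lack of fit:        ", lof)
```

The two parts add up to the total squared error. In this made-up data the lack-of-fit part is much larger than the pure-error part, which is what you would expect when a straight line is fitted to a curved trend.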

This is well explained here. If the model error is large compared with the pure error, you may want to change your model from linear to something more appropriate (possibly quadratic or some higher-order polynomial).
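One common way to judge whether the model error is "large" is the lack-of-fit F statistic, which compares the two parts per degree of freedom. The notation here is introduced only for this sketch: $n$ is the number of observations, $c$ is the number of distinct x values, $y_{ij}$ is the $j$-th of the $n_i$ observed values at the $i$-th distinct x value, $\bar y_i$ is their mean, and $\hat y_i$ is the fitted line's prediction there.

$$F = \frac{\sum\limits_i n_i\left(\bar y_i-\hat y_i\right)^2 / (c-2)}{\sum\limits_i\sum\limits_j\left(y_{ij}-\bar y_i\right)^2 / (n-c)}$$

Under the usual assumptions (independent, normally distributed errors with constant variance), and if a straight line really is adequate, this statistic follows an F distribution with $c-2$ and $n-c$ degrees of freedom, so a value far in the upper tail suggests lack of fit. Note that it needs repeated observations at some x values ($n > c$); otherwise there is no pure error to compare against.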