I am working on a data set with two variables: age (x) and wage (y). I fitted a series of polynomial regression models to the data, and I am trying to identify which model is the best (and simplest) for predicting wage from age. I have learned that one way to do this is to run an ANOVA on the models and interpret the F statistic and p-value reported for each one. After running the ANOVA, my intuition is to select model 2, or perhaps model 3, as best meeting these criteria. However, one of my reference sources suggests that model 4 would be the most appropriate. Here is the output from the ANOVA:
  Res.Df     RSS Df Sum of Sq        F    Pr(>F)
1   2998 5022216
2   2997 4793430  1    228786 143.5931 < 2.2e-16 ***
3   2996 4777674  1     15756   9.8888  0.001679 **
4   2995 4771604  1      6070   3.8098  0.051046 .
5   2994 4770322  1      1283   0.8050  0.369682
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
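In case it helps frame the question, here is a small sketch of how I understand the F column to be computed. It is in Python purely for the arithmetic (the table itself is R output). It rests on two assumptions of mine that the output does not state: each row is compared with the row directly above it, and the denominator is the residual mean square of the largest model (row 5).

```python
# Values copied from the Res.Df and RSS columns of the ANOVA table above.
res_df = [2998, 2997, 2996, 2995, 2994]
rss = [5022216, 4793430, 4777674, 4771604, 4770322]

# Assumption: the F denominator is the residual mean square of the
# largest model (row 5), shared by every comparison in the table.
scale = rss[-1] / res_df[-1]

f_stats = []
for k in range(1, len(rss)):
    df_diff = res_df[k - 1] - res_df[k]   # extra parameters in row k+1 vs row k
    ss_diff = rss[k - 1] - rss[k]         # reduction in RSS from those parameters
    f_stats.append((ss_diff / df_diff) / scale)

# Rows are numbered from 1 in the table, so the first comparison is row 2 vs row 1.
for row, f in enumerate(f_stats, start=2):
    print(f"model {row} vs model {row - 1}: F = {f:.2f}")
```

If those assumptions hold, the computed values agree with the table's F column to about two decimal places, which would suggest that each F statistic measures the improvement over the previous model rather than the quality of that model on its own.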
I don't understand why model 4 would be the best model in this instance, when there are models with higher F statistics and lower p-values to choose from. Would this have anything to do with model variability? I'm clearly missing something in my understanding and am hoping someone from the community can help. I realize this may come across as a machine learning problem, and I apologize if this is the wrong section to post in.
Many thanks for your help, all.