How to compare training and test errors in statistics?


I have a data set and I need to compare the performance of various statistical models: Least Squares, LASSO, Ridge Regression, to name a few of the key ones.

What are standard techniques for comparing the performance of these tools in their ability to perform prediction based on a set of data?

Best answer

From what I learned, the common method is to compute the mean squared error (MSE) or root mean squared error (RMSE) on both the training set and the test set. Least squares is unbiased under the usual linear-model assumptions, but it can have high variance, especially when there are many predictors relative to the number of observations. Because it minimizes the training error directly, least squares will have a training MSE no higher than that of a penalized method such as LASSO (with lambda chosen by cross-validation). On the test data, however, LASSO will often have the lower MSE. A low training MSE combined with a noticeably higher test MSE is the sign that least squares has overfit the data.
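As a rough illustration of the train/test comparison, here is a minimal sketch using NumPy on simulated data. Everything in it is an assumption made for the example: the sample sizes, the sparse true coefficients, the helper names `fit_ridge` and `mse`, and the fixed penalty `lam = 10.0` (in practice lambda would be chosen by cross-validation). Ridge is used as the penalized method because it has a closed form; LASSO would slot into the same loop with a suitable solver, for example scikit-learn's `LassoCV`.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Closed-form ridge coefficients; lam = 0 reduces to ordinary least squares."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


def mse(X, y, beta):
    """Mean squared error of the linear predictions X @ beta against y."""
    residuals = y - X @ beta
    return float(np.mean(residuals ** 2))


# Simulated data (an assumption for illustration): many predictors relative
# to the training size, with only three predictors that actually matter.
rng = np.random.default_rng(0)
n_train, n_test, p = 40, 200, 30
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 0.5]

X_train = rng.normal(size=(n_train, p))
X_test = rng.normal(size=(n_test, p))
y_train = X_train @ beta_true + rng.normal(size=n_train)
y_test = X_test @ beta_true + rng.normal(size=n_test)

# Fit each model on the training set only, then score it on both sets.
results = {}
for name, lam in [("least squares", 0.0), ("ridge", 10.0)]:
    beta = fit_ridge(X_train, y_train, lam)
    results[name] = (mse(X_train, y_train, beta), mse(X_test, y_test, beta))
    print(f"{name:>13}: train MSE = {results[name][0]:.3f}, "
          f"test MSE = {results[name][1]:.3f}")
```

The only ordering guaranteed here is on the training set, where least squares can never do worse than the penalized fit. Which model wins on the test set depends on the data, which is exactly why the test MSE is the number to compare.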

A further analysis can be performed by decomposing the expected test MSE into a squared-bias component, a variance component, and irreducible noise.
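For reference, the standard form of that decomposition can be written as follows. The notation is an assumption introduced here, not taken from the question: the data are modeled as $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, $\hat{f}$ is the model fitted on a random training set, and $(x_0, y_0)$ is a new test point.

```latex
\mathbb{E}\!\left[\bigl(y_0 - \hat{f}(x_0)\bigr)^2\right]
  = \underbrace{\bigl(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\bigr)^2}_{\text{squared bias}}
  + \underbrace{\operatorname{Var}\bigl(\hat{f}(x_0)\bigr)}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Penalized methods such as ridge and LASSO accept a somewhat larger bias term in exchange for a smaller variance term, which is how they can end up with a lower test MSE than least squares.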