I have a data set and I need to compare the performance of various statistical models: Least Squares, LASSO, Ridge Regression, to name a few of the key ones.
What are the standard techniques for comparing how well these models predict from a given data set?
From what I have learned, the common method is to calculate either the Mean Squared Error (MSE) or the Root Mean Squared Error (RMSE) on both the training and test sets. A method such as Least Squares has low bias, but potentially high variance. Therefore, Least Squares has a lower MSE on the training data than a method such as LASSO (where lambda was picked via cross-validation). However, on the test data, LASSO would likely have the lower MSE, particularly when there are many predictors relative to the number of observations. In that case, the Least Squares method has overfit the data.
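To make the train/test comparison concrete, here is a minimal sketch using scikit-learn. The synthetic data set, the split sizes, the lambda grid for Ridge, and the `results` dictionary are all assumptions made for illustration; in practice you would substitute your own data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, LinearRegression, RidgeCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): 40 predictors,
# only 10 of which actually influence the response.
X, y = make_regression(n_samples=200, n_features=40, n_informative=10,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

models = {
    "Least Squares": LinearRegression(),
    # LassoCV and RidgeCV pick the penalty (lambda, called alpha in
    # scikit-learn) by cross-validation on the training set only.
    "LASSO": LassoCV(cv=5, random_state=0),
    "Ridge": RidgeCV(alphas=np.logspace(-3, 3, 25)),
}

results = {}  # model name -> (training MSE, test MSE)
for name, model in models.items():
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[name] = (train_mse, test_mse)
    print(f"{name:14s} train MSE = {train_mse:10.2f}   test MSE = {test_mse:10.2f}")
```

Least Squares will always show the smallest training MSE here, because it minimizes exactly that quantity; whether it also loses on the test set depends on the data. With real data the predictors should usually be standardized before fitting LASSO or Ridge, since the penalty is sensitive to scale.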
A further analysis can be performed by breaking the MSE into components of variance and bias.
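For reference, the decomposition I have in mind is the standard one for squared-error loss. The notation is my own assumption: the data follow $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, $\hat{f}$ is the fitted model, and the expectation is taken over training sets and the noise at a fixed test point $x_0$.

```latex
\mathbb{E}\!\left[\bigl(y_0 - \hat{f}(x_0)\bigr)^2\right]
  = \underbrace{\bigl(\mathbb{E}[\hat{f}(x_0)] - f(x_0)\bigr)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\bigl(\hat{f}(x_0) - \mathbb{E}[\hat{f}(x_0)]\bigr)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The last term cannot be reduced by any model, so the comparison between methods comes down to how each one trades bias against variance.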