Machine learning: overfitting phenomena


Please explain why we should avoid "overfitting" when training a learning model, and how to detect it.


While I fully agree with the previous comments (for example, that book on statistical learning is great, and if you search around you will find some really nice accompanying videos from the authors as well), I would just add this short thought to get you started. Ideally, one wants to train a model that is good at predicting general cases; that generality is what makes it so useful, right? A model that is overfitted, or over-trained, is essentially good at predicting only the particular cases it has already seen. As for detecting it, that depends strongly on what you are trying to do; there is no simple, general lookup table.
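To make "good only at particular cases" concrete, here is a toy NumPy sketch (my own illustration, not part of the original answer): a degree-7 polynomial fit through 8 noisy samples reproduces the training points almost exactly, yet a lower-degree fit typically tracks the underlying function better on fresh inputs.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# 8 noisy samples of a smooth target function.
x_train = np.linspace(0, 1, 8)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, size=8)

# Degree 7 has one coefficient per point: it can memorize the noise.
overfit = Polynomial.fit(x_train, y_train, deg=7)
simple = Polynomial.fit(x_train, y_train, deg=3)

# Training error of the degree-7 fit is essentially zero (it interpolates)...
train_err = float(np.mean((overfit(x_train) - y_train) ** 2))

# ...but on fresh inputs it is evaluated against the true function,
# where the memorized noise hurts it.
x_new = np.linspace(0, 1, 100)
y_true = np.sin(2 * np.pi * x_new)
err_overfit = float(np.mean((overfit(x_new) - y_true) ** 2))
err_simple = float(np.mean((simple(x_new) - y_true) ** 2))
```

The near-zero training error combined with a much larger error on new inputs is exactly the "particular cases only" behaviour described above.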


You can detect over-fitting by comparing training and test performance. For instance, you can divide the data $X$ into training and test sets and compute the loss on each as the number of training samples increases. The gap between training and test loss indicates the amount of over-fitting: the larger the gap, the more the model has over-fit.
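As a sketch of that procedure (a synthetic example of my own, assuming a deliberately flexible polynomial model): fit on growing subsets of the training data and track the train/test gap.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)

# Synthetic data: y = 3x + noise, split into train and test sets.
x = rng.uniform(-1, 1, 200)
y = 3 * x + rng.normal(0, 0.5, size=200)
x_train, y_train = x[:150], y[:150]
x_test, y_test = x[150:], y[150:]

def mse(model, xs, ys):
    return float(np.mean((model(xs) - ys) ** 2))

# Train/test gap of a flexible (degree-9) fit as the training size grows.
gaps = {}
for n in (10, 50, 150):
    model = Polynomial.fit(x_train[:n], y_train[:n], deg=9)
    gaps[n] = mse(model, x_test, y_test) - mse(model, x_train[:n], y_train[:n])

# With only 10 points the degree-9 fit interpolates (train loss ~ 0) and
# the gap is large; with more data the gap shrinks.
```

Plotting `gaps` against `n` gives the familiar learning-curve picture: the two loss curves converge as the training set grows.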

In order to avoid over-fitting (and do well on out-of-sample or test data), it is common to use cross-validation for tuning model parameters. Cross-validation divides the data into $K$ folds, trains on $K-1$ of them and tests on the remaining one, averaging the results over the $K$ iterations. This helps avoid over-fitting and selects appropriate parameters for the model.
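A minimal hand-rolled K-fold sketch (my own illustration, reusing the NumPy polynomial setup; in practice a library such as scikit-learn provides this machinery). The tuned parameter here is the polynomial degree:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(2)

# Synthetic data: y = 3x + noise.
x = rng.uniform(-1, 1, 100)
y = 3 * x + rng.normal(0, 0.5, size=100)

def kfold_cv_mse(x, y, deg, K=5):
    """Average held-out MSE over K folds for a degree-`deg` polynomial fit."""
    idx = np.arange(len(x))
    folds = np.array_split(idx, K)
    scores = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        model = Polynomial.fit(x[train], y[train], deg=deg)
        scores.append(np.mean((model(x[test]) - y[test]) ** 2))
    return float(np.mean(scores))

# Pick the degree with the lowest cross-validated error.
degrees = [1, 3, 9, 15]
cv = {d: kfold_cv_mse(x, y, d) for d in degrees}
best = min(cv, key=cv.get)
```

Each candidate degree is scored on data it never trained on, so an over-flexible degree is penalized for its poor held-out performance rather than rewarded for its low training error.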