How can I increase validation accuracy when training accuracy has reached 100%?


Initial Situation

I have a classification model (a neural network) that I train on a dataset of 1400 samples, split into a training set (80%) and a validation set (20%). During training I plot the training- and validation-accuracy curves. Both accuracies grow until the training accuracy reaches 100%; at that point the validation accuracy stagnates at 98.7%.

My Assumptions

I think this behavior makes intuitive sense: once the model reaches a training accuracy of 100%, it gets "everything correct", so the loss used to update the weights is close to zero and the model "does not know what to learn further". Nonetheless, the validation accuracy had not flattened out yet, so there should still be some potential to increase it.

My Question

What can I do to further increase the validation accuracy? As a side note: I already apply slight data augmentation (small amounts of noise, rotations) to the training set (not to the validation set).

Thanks for your suggestions :)


It appears that your network learns to classify the data very quickly. To check that your train/validation accuracies are not just an artifact of one particular split, shuffle the data set repeatedly and split it again into train/validation sets in the same 80/20 ratio as before. If you continue to observe the same behaviour, then it is indeed possible that your model learns very quickly and would continue to improve if only it had more data.
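The repeated shuffling can be done with a few lines of NumPy; this is a minimal sketch (assuming the 1400 samples from the question are indexed `0..1399`, and that you retrain your model once per split):

```python
import numpy as np

def shuffled_split_indices(n_samples, train_frac=0.8, n_repeats=5, seed=0):
    """Yield (train_idx, val_idx) pairs from repeated random 80/20 splits.

    Each repeat draws a fresh permutation, so every split is a different
    partition of the same data set.
    """
    rng = np.random.default_rng(seed)
    n_train = int(train_frac * n_samples)
    for _ in range(n_repeats):
        perm = rng.permutation(n_samples)
        yield perm[:n_train], perm[n_train:]

# With the question's 1400 samples: five independent 1120/280 splits.
splits = list(shuffled_split_indices(1400))
```

You would then train one model per `(train_idx, val_idx)` pair and compare the resulting validation accuracies; if they all cluster around 98.7%, the original split was not a fluke.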

To test that, do a leave-one-out cross-validation (LOOCV). That is, for each $i=1,\ldots, 1400$ take your test set to be the $i$-th sample, and your training set to be the other $1399$ samples. If the average training accuracy over these $1400$ models is $100$% and the average test accuracy is again very high (and higher than $98.7$%), then we have reason to suspect that even more data would help the model.
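The LOOCV loop itself is model-agnostic; here is a minimal sketch where `train_and_score` is a hypothetical placeholder for your own training routine (it should fit a fresh model on the $n-1$ training samples and return 1.0 if the held-out sample is classified correctly, else 0.0):

```python
import numpy as np

def loocv_accuracy(X, y, train_and_score):
    """Leave-one-out cross-validation: hold out each sample in turn.

    `train_and_score(X_tr, y_tr, x_te, y_te)` is a stand-in for the
    asker's own model; it must return 1.0 for a correct prediction on
    the single held-out sample and 0.0 otherwise.
    """
    n = len(X)
    scores = []
    for i in range(n):
        mask = np.arange(n) != i          # all samples except the i-th
        scores.append(train_and_score(X[mask], y[mask], X[i], y[i]))
    return float(np.mean(scores))         # average test accuracy
```

Note that with 1400 samples this means training 1400 models, so it is only practical if a single training run is cheap.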

Another way to improve the model, related to what you said about your model not knowing "what further to learn" once the training accuracy reaches $100$%, is to add a regularisation term to your error function. Then, even when a set of weights achieves a training accuracy of $100$%, the optimiser can keep moving towards simpler weights that do the same, instead of stagnating.
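Concretely, for L2 (weight-decay) regularisation the objective becomes the data loss plus $\lambda \sum_w \|w\|^2$, which stays non-zero even when the data loss hits zero. A minimal NumPy sketch (the function names and $\lambda$ value here are illustrative, not from the question):

```python
import numpy as np

def l2_penalized_loss(data_loss, weights, lam=1e-4):
    """Total objective = data loss + lam * sum of squared weights.

    Even when `data_loss` is 0 (training accuracy 100%), the penalty
    term still produces a gradient that shrinks the weights.
    """
    return data_loss + lam * sum(np.sum(w ** 2) for w in weights)
```

Most deep-learning frameworks expose the same idea directly, e.g. as a weight-decay parameter on the optimiser, so in practice you would usually enable that rather than hand-roll the penalty.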