A machine learning model gets an accuracy of 90% on a dataset with 90% positive class and 10% negative class. Can we conclude that the model is a good classifier of the data?
Machine learning model accuracy
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 4 answers below.
On
Hint. The question behind the question is this:
Does there exist a completely useless classifier that still achieves $90\%$ accuracy?
If there is such a classifier, then we can't conclude from the given information that our classifier is any good. If there is no such classifier, then our classifier must at least be doing something right.
On
Consider a dataset with 90 examples of class A (say, the positive class) and 10 examples of class B (say, the negative class). We can build a "dumb" model that always predicts class A. On the training data this model gets 90% accuracy, even though it ignores its input entirely. So 90% accuracy on a dataset that is 90% class A is no better than this naive baseline, and on that evidence alone we cannot conclude that the model is a good classifier.
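To make the arithmetic concrete, here is a minimal sketch of that baseline. The label list and the function name `always_a` are made up for illustration; only the 90/10 split comes from the example above.

```python
# Hypothetical dataset matching the example: 90 examples of class A, 10 of class B.
labels = ["A"] * 90 + ["B"] * 10

def always_a(_example):
    """A 'dumb' classifier: ignores its input and always predicts class A."""
    return "A"

# The inputs do not matter for this classifier, so we pass the labels as stand-ins.
predictions = [always_a(x) for x in labels]

# Accuracy = fraction of predictions that match the true label.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)
print(accuracy)  # 0.9
```

Any real model should be compared against this majority-class baseline before its accuracy is called good.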
I hope that helps.
On
A good measurement for this is the confusion matrix. https://en.wikipedia.org/wiki/Confusion_matrix
Suppose you test people for a rare disease that only 1% of them have. A classifier that predicts that nobody has the disease would be 99% accurate, yet it would never detect a single case. Treating high accuracy as proof of a good classifier is a common misunderstanding in statistics.
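As a rough illustration of what the confusion matrix reveals here, the sketch below counts its four cells for the "nobody is sick" classifier. The population size of 1000 and the helper name `confusion_counts` are assumptions for the example, not part of the original answer.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for binary labels, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive and t != positive:
            fp += 1
        elif p != positive and t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

# Assumed population: 1000 people, 1% of whom (10) have the disease (label 1).
y_true = [1] * 10 + [0] * 990
# The trivial classifier predicts "no disease" (label 0) for everyone.
y_pred = [0] * len(y_true)

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)  # fraction of actual disease cases that were detected

print(tp, fp, fn, tn)  # 0 0 10 990
print(accuracy)        # 0.99
print(recall)          # 0.0
```

The accuracy looks excellent, but the recall of zero shows that every sick person is missed, which is exactly what the single accuracy number hides.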
If this is a classification problem of deciding whether a sample is in the positive class or not, an accuracy of 90% is not very impressive. Consider the trivial classifier that declares every sample to be positive: it also achieves 90% accuracy on this dataset.
I think "imbalanced dataset" is the appropriate keyword.