Figure out a threshold percentage for predictions provided by a logistic regression model.

I am trying to find a confidence threshold that minimizes false positives and false negatives and gives the best possible result set.

Here is what I am doing...

I am currently working on a machine learning project. It is a logistic regression model and the response is categorical. I have two categories right now, let's say 'A' and 'B'.

During testing, the model returns a confidence score for each of the two labels. Example output: A: 0.7689, B: 0.2311

I trained the model on 4000 data items for each classifier, ran predictions on 200 test items, and got results like the following:

Confidence range   Right predictions   Wrong predictions
50%-60%                    2                   3
60%-70%                    5                   0
70%-80%                    7                   2
80%-85%                    4                   1
85%-90%                   10                   0
90%-95%                   49                   4
95%-100%                 109                   4

Now I want to find a threshold percentage (e.g. 87%) such that, among the predictions I output, correct predictions are maximized and wrong predictions are minimized, while I still return a sufficient number of results. That is, I will only keep a prediction if its confidence is greater than the threshold (87%), and I will reject any prediction whose confidence is below the threshold.
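
To make the trade-off concrete, here is a minimal Python sketch that treats the lower edge of each confidence range as a candidate threshold and reports how many right and wrong predictions would be kept. The counts are copied from the table in the question; the names `BINS`, `sweep_thresholds`, "accuracy" and "coverage" are made up for this illustration. Because only binned counts are available, a threshold inside a range (such as 87%) cannot be evaluated this way.

```python
# Bin counts copied from the table in the question:
# (lower edge of confidence range in %, right predictions, wrong predictions)
BINS = [
    (50, 2, 3),
    (60, 5, 0),
    (70, 7, 2),
    (80, 4, 1),
    (85, 10, 0),
    (90, 49, 4),
    (95, 109, 4),
]


def sweep_thresholds(bins):
    """Use each bin's lower edge as a threshold and summarize what is kept."""
    total = sum(right + wrong for _, right, wrong in bins)
    rows = []
    for i, (threshold, _, _) in enumerate(bins):
        kept = bins[i:]  # all bins at or above the candidate threshold
        right = sum(r for _, r, _ in kept)
        wrong = sum(w for _, _, w in kept)
        accepted = right + wrong
        rows.append({
            "threshold": threshold,
            "right": right,
            "wrong": wrong,
            "accuracy": right / accepted,   # share of kept predictions that are right
            "coverage": accepted / total,   # share of all test items that are kept
        })
    return rows


for row in sweep_thresholds(BINS):
    print(f"{row['threshold']:>3}%  right={row['right']:>3}  wrong={row['wrong']:>2}  "
          f"accuracy={row['accuracy']:.3f}  coverage={row['coverage']:.3f}")
```

With these counts, a threshold of 85% keeps 176 of the 200 test items (168 right, 8 wrong), while a threshold of 95% keeps 113 items (109 right, 4 wrong).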

Currently I build a table like the one above and then decide on the threshold percentage from it.
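
For reference, building such a table from raw outputs could look roughly like the sketch below. It assumes each test item is available as a `(prob_a, prob_b, true_label)` tuple, takes the larger of the two probabilities as the confidence, and uses the same range edges as the table. The function name, the tuple layout and the sample values are hypothetical, not my real data.

```python
def build_confidence_table(predictions, edges):
    """Count right and wrong predictions per confidence range.

    predictions: list of (prob_a, prob_b, true_label) tuples (assumed layout).
    edges: ascending range edges as fractions, e.g. [0.5, 0.6, ..., 1.0].
    Returns one [right, wrong] pair per range.
    """
    table = [[0, 0] for _ in range(len(edges) - 1)]
    for prob_a, prob_b, true_label in predictions:
        predicted = "A" if prob_a >= prob_b else "B"
        confidence = max(prob_a, prob_b)
        for i in range(len(edges) - 1):
            is_last = i == len(edges) - 2
            in_range = edges[i] <= confidence < edges[i + 1]
            # The last range also includes its upper edge (confidence == 1.0).
            if in_range or (is_last and confidence == edges[-1]):
                table[i][0 if predicted == true_label else 1] += 1
                break
    return table


EDGES = [0.50, 0.60, 0.70, 0.80, 0.85, 0.90, 0.95, 1.00]

# Made-up sample, only to show the call.
sample = [
    (0.7689, 0.2311, "A"),  # predicted A at 76.89%, correct
    (0.10, 0.90, "A"),      # predicted B at 90%, wrong
    (0.02, 0.98, "B"),      # predicted B at 98%, correct
]
table = build_confidence_table(sample, EDGES)
```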

I want to know if there is a more rigorous mathematical method for choosing this threshold percentage.

I would appreciate any help!

Regards, Savya Saachi