Finding best predictors of a classification function

127 Views Asked by Bumbble Comm At 28 Mar 2026 - 4:32

I have a large dataset where each element has a number of "input" categories that are either present or absent (or, if you like, true or false, 1 or 0, etc.). Each element also has an output category, which is likewise binary.

A simplified version of this would be the following set:

Rained_yesterday,Rained_more_than_10mm_yesterday -> Raining_today
Is_odd_date,Is_summer -> Raining_today
Is_odd_date,Is_summer -> ()
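
To make the data layout concrete, here is a minimal sketch of turning lines like the ones above into a 0/1 matrix with one column per category. It assumes that "()" means the output category is absent; parse_line, columns, X and y are illustrative names, not part of any library.

```python
def parse_line(line):
    """Split 'A,B -> C' into the set of present inputs and a boolean outcome."""
    left, _, right = line.partition("->")
    present = {name.strip() for name in left.split(",") if name.strip()}
    # Assumption: an empty right-hand side or "()" means the outcome is absent.
    outcome = right.strip() not in ("", "()")
    return present, outcome


lines = [
    "Rained_yesterday,Rained_more_than_10mm_yesterday -> Raining_today",
    "Is_odd_date,Is_summer -> Raining_today",
    "Is_odd_date,Is_summer -> ()",
]

parsed = [parse_line(line) for line in lines]

# One column per category seen anywhere in the data, in a fixed (sorted) order.
columns = sorted(set().union(*(present for present, _ in parsed)))

# X[i][j] is 1 when category columns[j] is present in element i, else 0.
X = [[int(name in present) for name in columns] for present, _ in parsed]
y = [int(outcome) for _, outcome in parsed]

print(columns)
for features, target in zip(X, y):
    print(features, "->", target)
```

With the data in this form, the absence of a category is simply a 0 in its column, so a method that assigns signed weights to columns can pick up negations as well.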

From this dataset I want to find the categories that best explain or predict the output: first the most significant one, then the next most significant given that the first has already been used, and so on. For a more realistic version of the dataset above, the outcome might be:

Rained_yesterday
!Is_summer
Rained_more_than_10mm_yesterday
Is_odd_date

Note that I also need to detect the negation or absence of a category as a predictor. "Rained_more_than_10mm_yesterday" is likely to be ranked lower because it is strongly correlated with "Rained_yesterday". Ideally, I would also like to show that the top n predictors account for x% of all decisions.

Any pointers to algorithms to use, articles to read, etc., to help me get started on this would be appreciated.

Bumbble Comm On 09 Aug 2014 - 7:06 BEST ANSWER

You can use logistic regression, which is commonly used to build a multivariable model for predicting a dichotomous (binary) outcome. The independent variables can be continuous or categorical. Most statistical software packages offer a "stepwise" procedure that yields a model containing only the significant predictors. The odds ratio reported for each predictor then shows which predictors have the strongest independent effect on the outcome; an odds ratio below 1 means the predictor's presence makes the outcome less likely, which covers the negated case.
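
As a rough illustration of the stepwise idea, the sketch below runs a greedy forward selection with a small hand-written logistic regression. Everything in it is an assumption made for the example: the toy data is synthetic (not the asker's dataset), the model is fitted by plain gradient ascent rather than by a statistics package, and the function names are made up. At each step it adds the category that most improves the log-likelihood, then reports that category's odds ratio, a likelihood-ratio statistic, and the in-sample accuracy of the model so far.

```python
import math
from itertools import product

FEATURES = ["Rained_yesterday", "Rained_more_than_10mm_yesterday",
            "Is_summer", "Is_odd_date"]


def make_toy_data():
    """Deterministic synthetic data: 10 rows per valid feature combination."""
    rows, labels = [], []
    for rained, heavy, summer, odd in product([0, 1], repeat=4):
        if heavy and not rained:
            continue  # more than 10mm of rain implies that it rained
        positives = 2 + 6 * rained + heavy - 2 * summer  # rainy days out of 10
        for i in range(10):
            rows.append((rained, heavy, summer, odd))
            labels.append(1 if i < positives else 0)
    return rows, labels


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, cols, steps=600, lr=1.0):
    """Fit an intercept plus one weight per chosen column by gradient ascent."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    w = [0.0] * (len(cols) + 1)  # w[0] is the intercept
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in zip(X, labels):
            err = y - sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def evaluate(rows, labels, cols, w):
    """Return (log-likelihood, accuracy at a 0.5 threshold) on the given data."""
    ll, correct = 0.0, 0
    for row, y in zip(rows, labels):
        p = sigmoid(w[0] + sum(wj * row[c] for wj, c in zip(w[1:], cols)))
        ll += math.log(p if y else 1.0 - p)
        correct += (p >= 0.5) == bool(y)
    return ll, correct / len(rows)


def forward_selection(rows, labels, names):
    """Greedily add the column that most improves the log-likelihood."""
    chosen, ranking = [], []
    remaining = list(range(len(names)))
    best_ll, _ = evaluate(rows, labels, [], fit_logistic(rows, labels, []))
    while remaining:
        candidates = []
        for c in remaining:
            cols = chosen + [c]
            w = fit_logistic(rows, labels, cols)
            ll, acc = evaluate(rows, labels, cols, w)
            candidates.append((ll, c, w, acc))
        ll, c, w, acc = max(candidates)
        ranking.append({
            "name": names[c],
            "coef": w[-1],                    # weight of the newly added column
            "odds_ratio": math.exp(w[-1]),
            "lr_stat": 2.0 * (ll - best_ll),  # likelihood-ratio statistic, 1 d.f.
            "accuracy": acc,                  # in-sample accuracy of model so far
        })
        chosen.append(c)
        remaining.remove(c)
        best_ll = ll
    return ranking


rows, labels = make_toy_data()
ranking = forward_selection(rows, labels, FEATURES)
for step in ranking:
    label = ("!" if step["coef"] < 0 else "") + step["name"]
    flag = "significant" if step["lr_stat"] > 3.84 else "not significant"
    print(f"{label}: odds ratio {step['odds_ratio']:.2f}, "
          f"LR {step['lr_stat']:.2f} ({flag}), "
          f"cumulative accuracy {step['accuracy']:.0%}")
```

The 3.84 cutoff is the 5% critical value of a chi-square distribution with one degree of freedom, used here as a simple entry test. A negative coefficient (odds ratio below 1) is printed with a leading "!", and a category that is strongly correlated with one already in the model gains little and so ranks lower. The accuracy figure is measured on the training data, so on a real dataset it would be better estimated on held-out data, and the fitting itself is better left to an established statistics package.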
