Feature selection for continuous variables

I wonder how exactly "feature selection" should be performed when the feature values are continuous. When feature values are discrete, applying feature selection is very straightforward, but what should you do when you have a term-document matrix with (tf*idf|tf|idf) as feature values for a text classification task? I don't think it's correct to take the ~20% of features with the highest tf values, because that is biased towards features that appear in long documents.
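To make the long-document bias concrete, here is a minimal sketch with two made-up toy documents. The documents and the helper names `raw_tf` and `normalized_tf` are assumptions for illustration only, and length normalization is shown just to expose the bias, not as a proposed selection method.

```python
from collections import Counter

# Two hypothetical toy documents: a short one dominated by "cat" and a
# long one where "dog" appears a few times among many filler tokens.
short_doc = "cat cat sat".split()
long_doc = ("dog dog dog " + "filler " * 27).split()


def raw_tf(doc):
    """Raw term counts for one tokenized document."""
    return Counter(doc)


def normalized_tf(doc):
    """Term counts divided by the document length."""
    counts = Counter(doc)
    return {term: n / len(doc) for term, n in counts.items()}


raw_cat = raw_tf(short_doc)["cat"]
raw_dog = raw_tf(long_doc)["dog"]
norm_cat = normalized_tf(short_doc)["cat"]
norm_dog = normalized_tf(long_doc)["dog"]

# Ranking by raw tf favours "dog", simply because its document is longer.
print("raw tf:       ", {"cat": raw_cat, "dog": raw_dog})
# After dividing by document length, "cat" scores higher.
print("normalized tf:", {"cat": round(norm_cat, 3), "dog": round(norm_dog, 3)})
```

With raw counts "dog" outranks "cat", although "cat" makes up most of its document and "dog" only a tenth of its own, so a "top 20% by tf" cut would tend to keep terms from long documents.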

In short, what is a simple way to evaluate feature selection for continuous feature values?