I have a dataset with a power-law distribution. I would like to measure some kind of correlation or mutual information between the data and its class (for a feature selection task). Can I use mutual information in this case? If not, what other measures can I use? Any references to useful information would also be appreciated.
Thanks.