Normalizing multiple different features from unknown distributions

87 Views Asked by Bumbble Comm At 10 May 2026 - 8:27

I'm doing some "exploratory" data analysis over a large set of classes/proteins, with a few hundred different features (i.e. continuous variables) extracted from the data. The features are calculated by different criteria (letter frequency, letter group frequencies, physicochemical parameters, protein length, etc.), and there's no reason to assume that any feature has a normal distribution (but I don't know what sort of distribution it might have).

My goal is to normalize the features so that I can use them to discriminate between different classes, most likely with machine learning in Python or MATLAB.

So, what's the best way to normalize the features for the different groups? Standard normalization, i.e. (sample - mean) / standard deviation, doesn't seem appropriate, since the underlying distribution(s) may be non-normal, and dividing into percentiles loses a lot of information.
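To make the two options mentioned above concrete, here is a minimal sketch in Python with NumPy. The function names `zscore` and `percentile_bins`, the choice of 10 bins, and the log-normal sample standing in for a skewed feature are all assumptions for illustration, not part of my actual data or pipeline.

```python
import numpy as np

def zscore(x):
    """Standard normalization: subtract the sample mean, divide by the sample standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

def percentile_bins(x, n_bins=10):
    """Replace each value by the index (0..n_bins-1) of the percentile bin it falls into."""
    x = np.asarray(x, dtype=float)
    # Interior bin edges, e.g. the 10th, 20th, ..., 90th percentiles when n_bins=10.
    edges = np.percentile(x, np.linspace(0, 100, n_bins + 1)[1:-1])
    return np.searchsorted(edges, x, side="right")

# Hypothetical skewed feature (log-normal), used only as a stand-in for real data.
rng = np.random.default_rng(0)
feature = rng.lognormal(mean=0.0, sigma=1.0, size=1000)

z = zscore(feature)             # mean 0, unit variance, but the skew is unchanged
bins = percentile_bins(feature) # only n_bins distinct values remain

print("distinct z-scores:", np.unique(z).size)
print("distinct bin labels:", np.unique(bins).size)
```

The z-score is a linear rescaling, so a skewed feature stays just as skewed after it, while the binned version collapses the feature to a handful of labels, which is the information loss I am worried about.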

Thank you very much, and I apologize if this is trivial.
