I have a dataset of two identified populations, with several parameters recorded for each data point. I would like to find the criterion, i.e. a relation between (say) three of those parameters, that best separates the two populations.
For example, imagine we look at the human population and want to find the relation between parameters such as height, age and weight that best separates men from women.
I would like to do this in Python, but I don't know the correct keywords. I'm pretty sure this is a common optimization problem, so there likely already exists a library out there that can do it.
Measurements within a population are often approximately normally distributed. Therefore, I'd consider the EM algorithm for a mixture of Gaussians.
See: Mitzenmacher and Upfal: Probability and Computing, Cambridge Univ. Press, 2nd edition, 2017 (section 9.7).
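To make the suggestion concrete, here is a minimal sketch of EM for a two-component Gaussian mixture in plain Python. It is an illustration under stated assumptions, not a reference implementation: the data are synthetic (three made-up features, the third carrying no information), each component is assumed to have a diagonal covariance, and the function names (em_two_gaussians, log_gauss_diag, sq_dist) are hypothetical. Note that EM itself does not use the known group labels; here they are only used afterwards to check how well the two fitted components match the two populations.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density at x of a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def em_two_gaussians(X, n_iter=50, var_floor=1e-6):
    """Fit a 2-component Gaussian mixture (diagonal covariances) with EM."""
    n, d = len(X), len(X[0])
    overall_mean = [sum(x[j] for x in X) / n for j in range(d)]
    overall_var = [
        sum((x[j] - overall_mean[j]) ** 2 for x in X) / n + var_floor
        for j in range(d)
    ]
    # Deterministic start (an assumption of this sketch): use two mutually
    # distant data points as the initial means.
    m0 = max(X, key=lambda x: sq_dist(x, overall_mean))
    m1 = max(X, key=lambda x: sq_dist(x, m0))
    means = [list(m0), list(m1)]
    variances = [list(overall_var), list(overall_var)]
    weights = [0.5, 0.5]
    resp = []

    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point,
        # computed in log space for numerical stability.
        resp = []
        for x in X:
            logp = [
                math.log(weights[k]) + log_gauss_diag(x, means[k], variances[k])
                for k in range(2)
            ]
            top = max(logp)
            p = [math.exp(lp - top) for lp in logp]
            total = sum(p)
            resp.append([pk / total for pk in p])

        # M-step: re-estimate weights, means and variances from the
        # responsibility-weighted data.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)  # guard against empty component
            weights[k] = nk / n
            means[k] = [
                sum(r[k] * x[j] for r, x in zip(resp, X)) / nk for j in range(d)
            ]
            variances[k] = [
                sum(r[k] * (x[j] - means[k][j]) ** 2 for r, x in zip(resp, X)) / nk
                + var_floor
                for j in range(d)
            ]
    return weights, means, variances, resp


# Synthetic stand-in for two populations with three parameters each.
rng = random.Random(0)
group_a = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
           for _ in range(200)]
group_b = [[rng.gauss(5.0, 1.0), rng.gauss(5.0, 1.0), rng.gauss(0.0, 1.0)]
           for _ in range(200)]
X = group_a + group_b
true_labels = [0] * 200 + [1] * 200

weights, means, variances, resp = em_two_gaussians(X)
pred = [0 if r[0] >= r[1] else 1 for r in resp]

# Component order is arbitrary, so compare up to swapping the two labels.
agree = sum(p == t for p, t in zip(pred, true_labels)) / len(X)
accuracy = max(agree, 1.0 - agree)
print(f"agreement with known groups: {accuracy:.3f}")
```

The fitted means and variances describe each population, and the boundary where the two responsibilities are equal is the separating criterion. If scikit-learn is available, sklearn.mixture.GaussianMixture provides an EM implementation for Gaussian mixtures that also supports full covariance matrices.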