Classification: Why is the k-nearest neighbor method more appropriate for a mixture of Gaussians?


I'm reading "The Elements of Statistical Learning", which describes two scenarios for predicting the class label:

Scenario 1: The training data in each class were generated from bivariate Gaussian distributions with uncorrelated components and different means.

Scenario 2: The training data in each class came from a mixture of 10 low-variance Gaussian distributions, with individual means themselves distributed as Gaussian.
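To make the two scenarios concrete, here is a minimal simulation sketch. The specific class means, the unit component variances, and the low within-component variance `tau` are illustrative assumptions, not the book's exact values; only the overall structure (one Gaussian per class vs. a mixture of 10 low-variance Gaussians with Gaussian-distributed means) follows the descriptions above.

```python
import numpy as np

rng = np.random.default_rng(0)

def scenario1(n_per_class=100):
    """One bivariate Gaussian per class, uncorrelated components,
    different means (means chosen here for illustration)."""
    X0 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n_per_class, 2))
    X1 = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(n_per_class, 2))
    X = np.vstack([X0, X1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

def scenario2(n_per_class=100, n_components=10, tau=0.2):
    """Each class is a mixture of 10 low-variance Gaussians whose
    means are themselves drawn from a Gaussian (tau is an assumed
    small within-component standard deviation)."""
    X, y = [], []
    for label, center in [(0, [0.0, 0.0]), (1, [2.0, 2.0])]:
        # Draw the 10 component means from a Gaussian around the class center.
        means = rng.normal(loc=center, scale=1.0, size=(n_components, 2))
        # Pick a component uniformly for each point, then add low-variance noise.
        comp = rng.integers(n_components, size=n_per_class)
        pts = means[comp] + rng.normal(scale=tau, size=(n_per_class, 2))
        X.append(pts)
        y.extend([label] * n_per_class)
    return np.vstack(X), np.array(y)
```

Plotting samples from each generator shows the qualitative difference: Scenario 1 yields two roughly elliptical clouds (well suited to a single linear boundary), while Scenario 2 yields clumpy, multi-modal classes whose optimal boundary is irregular.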

The book says that linear least squares is more appropriate for Scenario 1, while k-nearest neighbors is more appropriate for Scenario 2, but I don't quite understand why.

Could anybody help explain the difference? Any help is greatly appreciated.