Unsupervised clustering in $10$ dimensions

151 Views Asked by Bumbble Comm At 28 Mar 2026 - 5:05

I have a set of $\sim1000$ feature vectors in $\sim10$ dimensions and would like to cluster them in an unsupervised manner. I am expecting some of the vectors to bunch together in groups, but quite a lot to be outliers that are nowhere near each other (so $\sim5$ meaningful clusters and $1$ cluster which is just a uniform distribution in all dimensions).

I'm thinking of using a Gaussian mixture model; does that sound reasonable? Is learning a GMM suitable for data of this dimension, or is there perhaps a more suitable technique? Does $1000$ vectors sound like enough to do $10$-dimensional clustering? I am quite new to this, so I am trying to get a feel for it. Thanks very much for any insight you might be able to provide! :)


Bumbble Comm On 14 Aug 2013 - 2:39

Your data are not "high dimensional" (1000x10 is small), but the question you are asking doesn't have a single "right" answer. Depending on what you need, I would suggest two different approaches:

1. The k-means algorithm (http://en.wikipedia.org/wiki/K-means_clustering), or possibly kernel k-means, though that is much more involved.
2. Principal component analysis (http://en.wikipedia.org/wiki/Principal_component_analysis) or Generalized PCA (http://arxiv.org/ftp/arxiv/papers/1202/1202.4002.pdf), and more generally subspace clustering methods.
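As a rough illustration of the second option, here is a minimal sketch of plain PCA computed through an SVD with NumPy. The `pca` helper and the synthetic 1000x10 data are assumptions made for the example, not part of the question; Generalized PCA and other subspace clustering methods are more involved and are not shown.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained_variance = S**2 / (X.shape[0] - 1)
    ratio = explained_variance / explained_variance.sum()
    return X_centered @ Vt[:n_components].T, ratio[:n_components]

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 1000 vectors in 10 dimensions whose variance
# is concentrated in 2 latent directions, plus a little isotropic noise.
latent = rng.normal(size=(1000, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(1000, 10))

Z, ratio = pca(X, n_components=2)
print(Z.shape)       # shape of the projected data
print(ratio.sum())   # fraction of variance kept by the 2 components
```

Looking at the explained-variance ratios is a cheap way to check whether the data really occupy all 10 dimensions before choosing a clustering method.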

K-means is probably the easiest out-of-the-box algorithm in your case, but the answer depends a lot on what you are trying to achieve.
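To give a feel for how little is needed, here is a minimal sketch of k-means (Lloyd's algorithm) in NumPy. The `kmeans` helper, the random initialisation, and the synthetic data (five Gaussian blobs plus a uniform background, loosely following the question's description) are all assumptions for illustration; a library implementation would normally be preferable in practice.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Basic Lloyd's algorithm: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid, shape (n, k).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centroid to the mean of its points (keep it if the cluster is empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the returned centroids.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), centroids

rng = np.random.default_rng(1)
# Hypothetical data: 5 tight blobs of 150 points each, plus 250 uniform background points.
centers = rng.uniform(-10, 10, size=(5, 10))
blobs = np.vstack([c + 0.5 * rng.normal(size=(150, 10)) for c in centers])
background = rng.uniform(-10, 10, size=(250, 10))
X = np.vstack([blobs, background])  # 1000 x 10

labels, centroids = kmeans(X, k=5)
counts = np.bincount(labels, minlength=5)
print(counts)  # cluster sizes; they always sum to 1000
```

Note that k-means has no notion of noise: every background point is assigned to its nearest centroid, so the cluster sizes include the outliers. The result also depends on the initialisation, so running it with several seeds is a reasonable precaution.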

By the way, I think your last cluster, the one that is uniform along all dimensions, will be hard to find in an unsupervised manner.
