I am fitting a Gaussian Mixture Model to high-dimensional data (40 dimensions).
I have trained the model using EM and learned the parameters, and now I want to know quantitatively:
Which is more important in capturing the structure of the data, the means or the covariance matrices?
So far, I can think of two measures: the Euclidean distance between the component means, and the cosine of the angle between the principal eigenvectors of the component covariance matrices, which would show whether the direction of greatest variability captured by each covariance matrix is similar to or different from the others.
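A minimal sketch of those two measures, assuming the fitted parameters are available as NumPy arrays `means` (shape K x d) and `covariances` (shape K x d x d). The function names and the random toy parameters are placeholders for illustration, not output from the actual model:

```python
import numpy as np

def mean_distances(means):
    """Pairwise Euclidean distances between component means, shape (K, K)."""
    diff = means[:, None, :] - means[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def principal_direction_cosines(covariances):
    """Absolute cosine between the leading eigenvectors of each pair of covariances."""
    # eigh returns eigenvalues in ascending order, so the last column
    # of the eigenvector matrix is the direction of greatest variance.
    leading = np.stack([np.linalg.eigh(c)[1][:, -1] for c in covariances])
    # Eigenvectors are unit-norm and their sign is arbitrary,
    # so the absolute dot product is the quantity of interest.
    return np.abs(leading @ leading.T)

# Toy stand-in for fitted parameters (assumption: K = 3 components, d = 40).
rng = np.random.default_rng(0)
K, d = 3, 40
means = rng.normal(size=(K, d))
A = rng.normal(size=(K, d, d))
covariances = A @ A.transpose(0, 2, 1) + np.eye(d)  # symmetric positive definite

D = mean_distances(means)                     # large entries: well-separated means
C = principal_direction_cosines(covariances)  # entries near 1: similar main directions
```

Both results are K x K symmetric matrices, so the off-diagonal entries can be compared pair by pair.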
Any ideas?
Look into the model-based clustering research by Adrian Raftery:
http://www.stat.washington.edu/raftery/Research/mbc.html
Raftery's principal concern is devising methods for identifying the component distributions of Gaussian mixtures. He provides a multitude of tools useful for the task you are describing, many of which are available in public R packages.