Background
Let's assume I'm using principal component analysis to cluster a 2-d data set, working from a non-normalized covariance matrix. I then solve for the eigenvalues and sort them in descending order of absolute value (i.e. the highest-magnitude eigenvalues are first/top in my list).
Then I calculate the eigenvector associated with each eigenvalue. In a simple case with a single cluster from a 2-d data set, I expect the two vectors to give me the primary and secondary orientations of an ellipse. Using the centroid and the orientation, I can generate an ellipse that represents the data set in a general sense. However, I can currently only use the "mass" of the data set (i.e. the sum of weighted data points) to estimate the size/area of the ellipse. I would like to determine the variance along each dimension, so I could create "confidence ellipses" (i.e. an ellipse that encompasses 68% of the data at 1 std. dev., another that encompasses 95% at 2 std. dev., etc.). I then plan to extend this to higher-dimensional data (i.e. 3-d, 4-d, 10-d, etc.).
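For concreteness, here is a minimal sketch of the setup described above, assuming NumPy. The synthetic data, the mean/covariance values, and the variable names are all illustrative, not part of the question; note that `np.cov` applies the usual n-1 normalization, which scales the eigenvalues but leaves the eigenvector directions unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-d cluster: correlated Gaussian data (illustrative only).
points = rng.multivariate_normal(mean=[3.0, -1.0],
                                 cov=[[4.0, 1.5], [1.5, 1.0]],
                                 size=500)

centroid = points.mean(axis=0)
cov = np.cov(points, rowvar=False)   # sample covariance (n-1 normalization)

# eigh handles symmetric matrices; it returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]    # re-sort so the largest magnitude is first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Orientation of the primary axis comes from the leading eigenvector.
angle = np.arctan2(eigvecs[1, 0], eigvecs[0, 0])
print(centroid, eigvals, np.degrees(angle))
```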
Question(s)
- Does the magnitude of each eigenvector correlate with its respective eigenvalue (i.e. are the two proportional)?
- Does the magnitude of each eigenvector directly provide any additional information (e.g. dimensional variance/standard deviation)? If not, is there a normalization/calculation that can derive the dimensional variance from the eigenvalue and/or eigenvector magnitude?
Thank you.
No, the magnitude of an eigenvector does not mean anything; only its direction matters. By convention, eigenvectors are usually normalized to unit length.
If $\vec{v}$ is the eigenvector of eigenvalue $\lambda$, it is easy to see that $5\vec{v}$, $-\sqrt{2}\vec{v}$, $10^{28}\vec{v}$ etc. are all eigenvectors of eigenvalue $\lambda$.
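This is easy to verify numerically. The sketch below (assuming NumPy; the example matrix is arbitrary) checks that arbitrary nonzero multiples of a unit eigenvector still satisfy $A\vec{w} = \lambda\vec{w}$:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])       # arbitrary symmetric matrix, like a covariance

eigvals, eigvecs = np.linalg.eigh(A)
v = eigvecs[:, -1]               # unit-length eigenvector of the largest eigenvalue
lam = eigvals[-1]

# Any nonzero scalar multiple of v is still an eigenvector of eigenvalue lam.
for c in (5.0, -np.sqrt(2.0), 1e28):
    w = c * v
    assert np.allclose(A @ w, lam * w)
```

Solvers like `np.linalg.eigh` simply pick the unit-length representative from each of these infinitely many equivalent eigenvectors.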