I want to simulate samples from a multivariate normal distribution in MATLAB to help me understand the biplot. Specifically, I want to explore how the percentage of variability explained by the plotted PCs affects the interpretation of the biplot.
P.S. The biplot is a tool that simultaneously displays the scores and the loadings obtained from a singular value decomposition (SVD). Let $X_{n \times p}$ denote the observation matrix, with each of the $p$ variables centered and scaled. The SVD gives $X = U \Lambda V^{T}$, which can also be written element-wise as $x_{ij}=\sum_{s=1}^{S}u_{is}(\lambda_{s}v_{js})$, where $S$ is the number of singular values (the rank of $X$). The first factor, $u_{is}$, relates to the scores and is usually plotted as dots; the factor in parentheses, $\lambda_{s}v_{js}$, relates to the loadings and is usually plotted as arrows representing the variables.
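The decomposition above can be sketched in MATLAB as follows. This is only a minimal illustration: the placeholder data from `randn`, the sizes `n` and `p`, and the variable names `scores` and `loadings` are assumptions made for the example, and the column-wise standardization relies on implicit expansion (R2016b or later).

```matlab
rng(1);                               % fix the seed (arbitrary choice)
n = 100; p = 4;                       % hypothetical sample size and number of variables
X = randn(n, p);                      % placeholder data, not a real data set
X = (X - mean(X)) ./ std(X);          % center and scale each column

[U, L, V] = svd(X, 'econ');           % X = U * L * V'
lambda = diag(L);                     % singular values lambda_s

scores   = U(:, 1:2);                     % u_{is} for s = 1, 2 (plotted as dots)
loadings = V(:, 1:2) * diag(lambda(1:2)); % lambda_s * v_{js} (plotted as arrows)

reconErr = norm(X - U * L * V', 'fro');   % reconstruction error of the full SVD
fprintf('Reconstruction error: %.2e\n', reconErr);
```

Keeping only the first two columns gives the coordinates drawn in a two-dimensional biplot; the full sum over all $S$ components reproduces $X$ up to rounding.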
One interpretation of the biplot is that the angle between any two variable arrows indicates how strongly those variables are correlated. I suspect that the validity of this interpretation depends on the percentage of variability explained by the two or three PCs that are plotted. Therefore, I want to run a simulation to find out in which cases the interpretation should be treated with care. My working hypothesis is that the less variability the plotted PCs preserve, the less reliable this interpretation becomes. Hence, my simulation must be able to control the percentage of variability explained by each PC; in other words, the eigenvalues of the variance-covariance matrix must be under my control.
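One way to make this hypothesis concrete is to compare the cosines of the angles between the arrows with the sample correlations. The sketch below assumes the arrow coordinates $\lambda_{s}v_{js}$ defined earlier and placeholder correlated data; the names `cosK`, `cosFull` and `R` are made up for the example. With all components kept, the cosines equal the correlations exactly (for standardized columns), so any discrepancy in the plotted plane comes from the discarded components.

```matlab
rng(2);                                   % fix the seed (arbitrary choice)
n = 200; p = 4;                           % hypothetical sizes
X = randn(n, p) * randn(p, p);            % placeholder correlated data
X = (X - mean(X)) ./ std(X);              % center and scale each column

[~, L, V] = svd(X, 'econ');
lambda = diag(L);
explained = 100 * lambda.^2 / sum(lambda.^2);  % percent variability per PC

R = (X' * X) / (n - 1);                   % sample correlation matrix

k = 2;                                    % number of plotted PCs
G = V(:, 1:k) * diag(lambda(1:k));        % arrows in the plotted plane
len = sqrt(sum(G.^2, 2));                 % arrow lengths
cosK = (G * G') ./ (len * len');          % cosines of angles between plotted arrows

Gfull = V * diag(lambda);                 % arrows using all components
lenF = sqrt(sum(Gfull.^2, 2));
cosFull = (Gfull * Gfull') ./ (lenF * lenF');

fprintf('Variability explained by %d PCs: %.1f%%\n', k, sum(explained(1:k)));
fprintf('Max |cos - r| in the plotted plane: %.3f\n', max(abs(cosK(:) - R(:))));
fprintf('Max |cos - r| with all PCs:         %.2e\n', max(abs(cosFull(:) - R(:))));
```

Repeating this for covariance structures with different eigenvalue profiles would show how the discrepancy changes with the percentage of variability retained.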
MATLAB provides the function 'mvnrnd' for this purpose. It takes a mean vector and a variance-covariance matrix as input arguments, for example:
n = 100; % sample size
ndim = 4; % four properties to simulate
mu = zeros(1, ndim); % mean vector (a 1-by-ndim row vector)
A = rand(ndim, ndim);
sigma = A'*A;
data = mvnrnd(mu, sigma, n);
The mean vector is easy to define, but the variance-covariance matrix is not. Besides the necessary condition that it be symmetric positive semi-definite, I also want to be able to choose its eigenvalues arbitrarily. I do not know how to construct such a matrix. Can I base the simulation on the SVD?
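A minimal sketch of the kind of construction I have in mind, offered as one possibility rather than a definitive method: choose the eigenvalues, draw a random orthogonal matrix, and assemble the covariance as $Q \,\mathrm{diag}(\lambda)\, Q^{T}$. The eigenvalues in `lambda` are hypothetical, and taking `Q` from the QR factorization of a Gaussian matrix is an assumption of this example. `mvnrnd` requires the Statistics and Machine Learning Toolbox.

```matlab
rng(3);                                 % fix the seed (arbitrary choice)
n = 100;                                % sample size
ndim = 4;                               % number of variables
lambda = [6 2.5 1 0.5];                 % hypothetical target eigenvalues (all >= 0)

[Q, ~] = qr(randn(ndim));               % random orthogonal matrix (assumed construction)
sigma = Q * diag(lambda) * Q';          % covariance with the chosen eigenvalues
sigma = (sigma + sigma') / 2;           % remove rounding asymmetry

mu = zeros(1, ndim);                    % mean vector
data = mvnrnd(mu, sigma, n);            % simulated sample, n-by-ndim

pctTarget = 100 * lambda / sum(lambda); % population percent variability per PC
eigCheck = sort(eig(sigma), 'descend')';% should reproduce lambda
disp(pctTarget);
disp(eigCheck);
```

Two caveats apply. These are population eigenvalues, so the eigenvalues of the sample covariance of `data` will only approximate them for finite `n`. Also, if the variables are scaled before the biplot, the relevant eigenvalues are those of the correlation matrix, which generally differ from the eigenvalues of `sigma`.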