How to generate a covariance matrix?

I would like to generate a $50\times50$ covariance matrix for a random vector $X \in \mathbb{R}^{50}$ satisfying the following conditions:

  1. One variance is 10 times larger than the others.
  2. The components of $X$ are only slightly correlated.

Is there a way of doing this in Python/R etc? Or is there a covariance matrix that you can think of that might satisfy these requirements?

Thank you for your help!

Best answer

Comment continued: I don't know whether this helps, but you mentioned R in your question. Here is an R program that generates fake data and then computes the sample variance-covariance matrix.

I use $5$ instead of $50$ variables to save space, but the idea is the same. Before the noise is added, the population variances of the five variables are $10, 1, 1, 1, 1$, and the sample variances (for $n = 10{,}000$ observations per variable) come out close to these values.

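The same noise vector, with very small variance ($0.01^2 = 10^{-4}$), is added to each of the variables, which induces a small positive population covariance between every pair. (Of course, this also adds that amount to the population variance of each variable.) With $n = 10{,}000$ the sampling error of an off-diagonal entry is roughly $0.01$ to $0.03$, so the sample covariances in the output are small but of mixed sign.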
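To make the population structure behind this construction explicit, here is a short derivation. The symbols $Z_i$, $\varepsilon$, $\sigma_i^2$ and $\tau^2$ are notation introduced here for illustration, not the answer's own, and the derivation assumes the shared noise is independent of the other draws, as it is in the simulation. Write $X_i = Z_i + \varepsilon$, where the $Z_i$ are independent with variances $\sigma_1^2 = 10$ and $\sigma_i^2 = 1$ for $i \ge 2$, and $\varepsilon$ has variance $\tau^2 = 10^{-4}$. Then

$$\operatorname{Var}(X_i) = \sigma_i^2 + \tau^2, \qquad \operatorname{Cov}(X_i, X_j) = \tau^2 \quad (i \ne j),$$

so the population covariance matrix is

$$\Sigma = \operatorname{diag}(10, 1, 1, 1, 1) + \tau^2\, \mathbf{1}\mathbf{1}^{\top}.$$

The implied correlations are $\tau^2 / (1 + \tau^2) \approx 10^{-4}$ between two of the unit-variance variables and $\tau^2 / \sqrt{(10 + \tau^2)(1 + \tau^2)} \approx 3.2 \times 10^{-5}$ between the first variable and any other. A larger noise standard deviation would give a stronger, though still uniform, correlation.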

The resulting variance-covariance matrix may give you an idea what you need to do.

```r
set.seed(1234)  # set a different seed for a fresh simulation
n <- 10^4                                  # observations per variable
X <- matrix(0, ncol = 5, nrow = n)
noise <- rnorm(n, 0, .01)                  # shared noise (SD 0.01) added to every column
X[, 1] <- rnorm(n, 0, sqrt(10)) + noise    # first variable: variance about 10
for (i in 2:5) {
  X[, i] <- rnorm(n, 0, 1) + noise         # remaining variables: variance about 1
}
var(X)                                     # sample variance-covariance matrix
##             [,1]         [,2]         [,3]         [,4]         [,5] 
##[1,] 10.120852001 -0.003395094  0.032612955 -0.066317355 -0.017420682
##[2,] -0.003395094  0.998067921  0.002402681 -0.013012347  0.031856352
##[3,]  0.032612955  0.002402681  1.009511299 -0.001593524  0.001492451
##[4,] -0.066317355 -0.013012347 -0.001593524  1.009853838  0.024616714
##[5,] -0.017420682  0.031856352  0.001492451  0.024616714  0.988864284
```
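If a population covariance matrix is wanted rather than a sample estimate, one option is to write the $50\times50$ matrix down directly. The following is only a sketch in base R: the helper `make_cov`, its argument names, and the common covariance of `0.05` are assumptions chosen for illustration (the question only says "slightly correlated"), not part of the original answer.

```r
# Hypothetical helper: p x p covariance matrix with one inflated variance
# and the same small covariance between every pair of components.
make_cov <- function(p = 50, big_var = 10, base_var = 1, common_cov = 0.05) {
  Sigma <- matrix(common_cov, nrow = p, ncol = p)   # off-diagonal entries
  diag(Sigma) <- c(big_var, rep(base_var, p - 1))   # exact variances on the diagonal
  Sigma
}

Sigma <- make_cov()

# Sanity checks: a valid covariance matrix is symmetric and positive (semi)definite.
stopifnot(isSymmetric(Sigma))
eig <- eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values
stopifnot(all(eig > 0))

# Convert to correlations to see how weak the dependence is.
R <- cov2cor(Sigma)
max_offdiag_cor <- max(abs(R[upper.tri(R)]))

# Draw samples using base R only: chol() returns upper-triangular U with
# t(U) %*% U == Sigma, so the rows of Z %*% U have covariance Sigma.
set.seed(1)
n <- 1000
Z <- matrix(rnorm(n * nrow(Sigma)), nrow = n)
X <- Z %*% chol(Sigma)
```

Setting the diagonal explicitly keeps the variance ratio at exactly 10, whereas adding shared noise to the data shifts every variance slightly. The matrix stays positive definite as long as `common_cov` is positive and smaller than `base_var`; calling `var(X)` on the simulated draws should then give a sample matrix close to `Sigma`.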