Turn uncorrelated variables into correlated ones using a covariance matrix

https://scipy-cookbook.readthedocs.io/items/CorrelatedRandomSamples.html

I am wondering if there's a way to turn uncorrelated variables into correlated ones using a covariance matrix.

The link above does so using the Cholesky decomposition for the normal distribution. But does it also work for non-normal distributions such as Poisson, uniform, etc.?

There is 1 best solution below

You can use a Gaussian copula approach to combine a correlation-based dependence structure with a (pretty much) arbitrary collection of marginal distributions. This works best for continuous marginal distributions but can probably be adapted for discrete/mixed distributions too.

Let's suppose you have $n$ marginal distributions with cumulative distribution functions $F_1,\ldots,F_n : \mathbb{R}\rightarrow[0,1]$, and an $n\times n$ correlation matrix $\Sigma$. You can generate a random sample of $n$ variables with the given marginal distributions and dependence structure based on $\Sigma$ as follows:

  1. Sample a random vector $(X_1,\ldots,X_n)$ from a multivariate standard normal distribution with correlation matrix $\Sigma$. (You already know how to do this.)
  2. For $i=1,\ldots,n$ compute $Y_i=F^{-1}_i(\Phi(X_i))$, where $F^{-1}_i$ is the $i$th inverse cumulative distribution function and $\Phi$ is the standard normal cumulative distribution function.

The $Y_i$ computed in this way will have marginal distribution $F_i$ and a dependence structure given by the correlation matrix $\Sigma$.
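
The two steps can be sketched in Python roughly as follows. This is only an illustration, assuming NumPy and SciPy are available; the function name `gaussian_copula_sample`, the example correlation of 0.7, and the choice of an exponential and a uniform marginal are my own assumptions, not something from the question. SciPy's frozen distributions expose the inverse CDF as `ppf`.

```python
import numpy as np
from scipy import stats


def gaussian_copula_sample(corr, marginals, size, seed=None):
    """Sketch: draw `size` rows whose i-th column has marginal `marginals[i]`.

    `corr` is the correlation matrix Sigma of the underlying normal vector,
    `marginals` is a list of frozen scipy.stats distributions (hypothetical
    choices, any object with a `ppf` method would do).
    """
    rng = np.random.default_rng(seed)
    corr = np.asarray(corr, dtype=float)

    # Step 1: X ~ N(0, Sigma), built from independent normals and a
    # Cholesky factor L of Sigma (Sigma = L @ L.T).
    chol = np.linalg.cholesky(corr)
    z = rng.standard_normal((size, corr.shape[0]))
    x = z @ chol.T

    # Step 2: U_i = Phi(X_i) is uniform on (0, 1); Y_i = F_i^{-1}(U_i).
    u = stats.norm.cdf(x)
    return np.column_stack([m.ppf(u[:, i]) for i, m in enumerate(marginals)])


# Example inputs (assumed for illustration only).
corr = np.array([[1.0, 0.7], [0.7, 1.0]])
marginals = [stats.expon(scale=2.0), stats.uniform(loc=0.0, scale=1.0)]
samples = gaussian_copula_sample(corr, marginals, size=20_000, seed=0)

print(samples.mean(axis=0))  # column means, near the marginal means 2.0 and 0.5
```

Each column of `samples` follows its own marginal, while the dependence between columns comes entirely from the normal vector drawn in step 1.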

Notes:

  • You will need to know how to compute the inverse cumulative distribution functions $F^{-1}_i$ for your chosen distributions, but that's a separate problem.
  • The dependence structure of the $Y_i$ is determined by $\Sigma$, which is the correlation matrix of the $X_i$, but that's not the same as saying that the correlation matrix of the $Y_i$ is $\Sigma$. In general, this will not be the case. Indeed, depending on the marginal distributions $F_i$, the correlation of the $Y_i$ might not even be defined.
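
To make the second note concrete, here is a small numerical check, again assuming NumPy and SciPy. The lognormal marginals, $\rho = 0.7$ and the sample size are arbitrary choices of mine. For lognormal marginals with shape parameter $\sigma$, the Pearson correlation of the $Y_i$ works out to $(e^{\rho\sigma^2}-1)/(e^{\sigma^2}-1)$, which is roughly 0.59 here rather than 0.7. Rank correlation, on the other hand, is unchanged, because each $Y_i$ is an increasing function of $X_i$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
rho, sigma, n = 0.7, 1.0, 200_000  # assumed example values
corr = np.array([[1.0, rho], [rho, 1.0]])

# Correlated standard normals, then the copula transform Y_i = F^{-1}(Phi(X_i)).
x = rng.multivariate_normal(mean=np.zeros(2), cov=corr, size=n)
u = stats.norm.cdf(x)
y = stats.lognorm(s=sigma).ppf(u)  # ppf is SciPy's name for the inverse CDF

# Pearson correlation before and after the transform.
pearson_x = np.corrcoef(x, rowvar=False)[0, 1]
pearson_y = np.corrcoef(y, rowvar=False)[0, 1]

# Closed-form Pearson correlation for lognormal marginals under this copula.
theory_y = (np.exp(rho * sigma**2) - 1) / (np.exp(sigma**2) - 1)

# Spearman (rank) correlation is preserved by the monotone transform.
spearman_x = stats.spearmanr(x[:, 0], x[:, 1])[0]
spearman_y = stats.spearmanr(y[:, 0], y[:, 1])[0]

print(f"Pearson  X: {pearson_x:.3f}  Y: {pearson_y:.3f}  (theory for Y: {theory_y:.3f})")
print(f"Spearman X: {spearman_x:.3f}  Y: {spearman_y:.3f}")
```

If you need the $Y_i$ themselves to hit a target Pearson correlation, you would have to adjust $\Sigma$ to compensate, and for some marginals a given target may not be reachable at all.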