Multi-variate Pseudo-random Generation?

How can one generate pseudo-random data from a multivariate distribution?

I know that in the univariate case one generates uniform data and applies the inverse CDF of the distribution. But what about the multivariate case?

Best answer

One way to generate data from a multivariate distribution is to adopt a copula framework (https://en.wikipedia.org/wiki/Copula_(probability_theory)). A copula is "just" a multivariate distribution on $[0,1]^d$ with uniform margins.

By Sklar's theorem, the joint distribution $H$ can then be written as \begin{align} H(x_1,\ldots,x_d) = C(F_1(x_1),\ldots,F_d(x_d)), \end{align} where $C$ is the copula and $F_1,\ldots,F_d$ are the marginal distributions of $H$. Conversely, any combination of univariate distributions and a copula in the above fashion yields a valid joint distribution $H$.
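As a quick sanity check of this formula, consider the simplest copula, the independence copula. Plugging it in recovers the familiar product form for independent components:

\begin{align} C(u_1,\ldots,u_d) = \prod_{i=1}^d u_i \quad\Longrightarrow\quad H(x_1,\ldots,x_d) = \prod_{i=1}^d F_i(x_i). \end{align}

Any other choice of $C$ keeps the same margins $F_1,\ldots,F_d$ but changes how the components depend on each other.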

Given a pseudo-observation $(U_1,\ldots,U_d)$ from $C$, a sample from $H$ can be generated as $(F_1^{-1}(U_1),\ldots,F_d^{-1}(U_d))$, where $F_i^{-1}$ is the quantile function of the $i$-th margin. This is the same inverse-CDF transform as in the univariate case, applied componentwise. The difference is that the dependence between $U_1,\ldots,U_d$ now carries over to $(F_1^{-1}(U_1),\ldots,F_d^{-1}(U_d))$.
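To make the recipe concrete, here is a minimal sketch in Python using only the standard library. It assumes a bivariate Gaussian copula with correlation `rho` (one possible choice of $C$, picked here because it is easy to sample), an exponential first margin and a normal second margin. The function names `gaussian_copula_pair`, `exponential_quantile` and `sample_h`, as well as all parameter values, are made up for illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal; its CDF plays the role of Phi


def gaussian_copula_pair(rho, rng):
    """Draw (U1, U2) from a bivariate Gaussian copula with correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    # Cholesky step: z2 is standard normal with corr(z1, z2) = rho.
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Phi maps each normal coordinate to a uniform one.
    # Clamp away from 0 and 1 so the quantile functions below stay finite.
    eps = 1e-12
    return tuple(min(max(STD_NORMAL.cdf(z), eps), 1.0 - eps) for z in (z1, z2))


def exponential_quantile(u, rate):
    """Inverse CDF of the exponential distribution with the given rate."""
    return -math.log(1.0 - u) / rate


def sample_h(n, rho, seed=0):
    """Draw n pairs from H(x1, x2) = C(F1(x1), F2(x2)).

    Assumed margins: F1 = Exponential(rate=0.5), F2 = Normal(mu=10, sigma=2).
    """
    rng = random.Random(seed)
    normal_margin = NormalDist(mu=10.0, sigma=2.0)
    out = []
    for _ in range(n):
        u1, u2 = gaussian_copula_pair(rho, rng)
        # Componentwise inverse-CDF transform, as in the univariate case.
        out.append((exponential_quantile(u1, rate=0.5), normal_margin.inv_cdf(u2)))
    return out


samples = sample_h(5, rho=0.8)
```

Swapping `exponential_quantile` or `normal_margin.inv_cdf` for other quantile functions changes the margins while leaving the dependence structure of the copula sample untouched.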

At this point we have reduced the general problem of generating pseudo-observations from $H$ to that of generating pseudo-observations from $C$, so you could say we didn't gain much.

However, once you can sample from $C$, you can freely choose the margins (!), meaning that you now have access to a whole family of distributions.

How to actually sample from a copula is another topic and you can find details in books (and papers) on copulas.
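To give a flavour of what such an algorithm can look like, here is a minimal Python sketch of one standard technique, conditional inversion, for the bivariate Clayton copula with parameter $\theta > 0$. The Clayton family is chosen only because its conditional distribution can be inverted in closed form; the function name `clayton_pair` and the parameter values are hypothetical, and other families generally need different algorithms.

```python
import random


def clayton_pair(theta, rng):
    """Draw (U1, U2) from a bivariate Clayton copula (theta > 0).

    Conditional inversion: draw U1 and an independent uniform V, then
    solve C(u2 | u1) = v for u2, which has a closed form for Clayton.
    """
    u1 = 1.0 - rng.random()  # uniform on (0, 1]; avoids u1 == 0
    v = 1.0 - rng.random()   # independent uniform on (0, 1]
    inner = u1 ** (-theta) * (v ** (-theta / (1.0 + theta)) - 1.0) + 1.0
    u2 = inner ** (-1.0 / theta)
    return u1, u2


rng = random.Random(42)
pairs = [clayton_pair(2.0, rng) for _ in range(5)]
```

For the Clayton copula, Kendall's tau equals $\theta/(\theta+2)$, so comparing the empirical value on a larger sample against it gives a simple check of the sampler.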

Luckily, R has the copula package (https://cran.r-project.org/web/packages/copula/index.html), which provides simulation algorithms for the most popular copula families.