How can one generate pseudo-random data from a multivariate distribution?
I know that in the univariate case one generates uniform data and applies the inverse CDF of the distribution. But what about the multivariate case?
One way to generate data from a multivariate distribution is to adopt a copula framework (https://en.wikipedia.org/wiki/Copula_(probability_theory)). A copula is "just" a multivariate distribution on $[0,1]^d$ with uniform margins.
By Sklar's theorem the joint distribution $H$ can then be written as \begin{align} H(x_1,\ldots,x_d) = C(F_1(x_1),\ldots,F_d(x_d)), \end{align} where $C$ is the copula and $F_1,\ldots,F_d$ are the marginal distributions of $H$. Conversely, combining any univariate distributions $F_1,\ldots,F_d$ with any copula $C$ in the above fashion yields a valid joint distribution $H$.
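As a small worked instance of the formula (my own illustration, not tied to any particular application): plugging in the independence copula $C(u_1,\ldots,u_d) = u_1 u_2 \cdots u_d$ gives \begin{align} H(x_1,\ldots,x_d) = \prod_{i=1}^{d} F_i(x_i), \end{align} which is the joint distribution of independent components with margins $F_1,\ldots,F_d$. Choosing a different copula keeps the same margins but changes how the components depend on each other.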
If you now have a pseudo-observation $(U_1,\ldots,U_d)$ from $C$, a sample from $H$ can be generated as $(F_1^{-1}(U_1),\ldots,F_d^{-1}(U_d))$. Here you see the similarity to the univariate case. However, the dependence between $U_1,\ldots,U_d$ will now carry over to $(F_1^{-1}(U_1),\ldots,F_d^{-1}(U_d))$.
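The recipe above can be sketched in a few lines of Python. This is only a minimal illustration under assumptions of my own: I picked a bivariate Gaussian copula (because it is easy to sample with the standard library), a correlation parameter of 0.8, and exponential and Pareto margins. The helper names `sample_gaussian_copula`, `exponential_inv_cdf` and `pareto_inv_cdf` are hypothetical and not taken from any package.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(n, rho, rng):
    """Draw n pairs (u1, u2) from a bivariate Gaussian copula with parameter rho."""
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        # Correlated standard normals built from two independent ones.
        z1 = rng.gauss(0.0, 1.0)
        w = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * w
        # Applying the normal CDF makes each margin uniform on (0, 1).
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def exponential_inv_cdf(u, rate):
    """Inverse CDF of the exponential distribution with the given rate."""
    return -math.log1p(-u) / rate


def pareto_inv_cdf(u, x_min, alpha):
    """Inverse CDF of the Pareto distribution with scale x_min and shape alpha."""
    return x_min * (1.0 - u) ** (-1.0 / alpha)


rng = random.Random(0)  # fixed seed so the run is reproducible
uniforms = sample_gaussian_copula(5000, 0.8, rng)

# Componentwise inverse CDFs turn the copula sample into a sample from H.
sample = [
    (exponential_inv_cdf(u1, 2.0), pareto_inv_cdf(u2, 1.0, 3.0))
    for u1, u2 in uniforms
]
print(sample[:3])
```

Each coordinate of `sample` has the margin you asked for, while the dependence between the coordinates is inherited from the copula sample `uniforms`.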
At this point we have reduced the general problem of generating pseudo-observations from $H$ to that of generating pseudo-observations from $C$, so you could say we didn't gain much.
However, once you can sample from $C$, you can freely choose the margins (!), meaning that you now have access to a whole family of distributions.
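To make the "freely choose the margins" point concrete, here is a hedged sketch that reuses one fixed copula sample with two different sets of margins and compares Kendall's tau, a rank-based dependence measure. The copula (bivariate Gaussian with parameter 0.6), the margins, and the brute-force `kendall_tau` helper are all my own assumptions for illustration. Since inverse CDFs of continuous distributions are increasing, the ranks (and hence tau) should agree across the three data sets.

```python
import math
import random
from statistics import NormalDist


def kendall_tau(xs, ys):
    """Kendall's tau by brute-force pair counting (O(n^2), fine for small n)."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] > xs[j]) - (xs[i] < xs[j])
            t = (ys[i] > ys[j]) - (ys[i] < ys[j])
            score += s * t  # +1 for concordant pairs, -1 for discordant ones
    return 2.0 * score / (n * (n - 1))


rng = random.Random(1)  # fixed seed so the run is reproducible
phi = NormalDist()
rho = 0.6  # assumed copula parameter

# One fixed sample (U1, U2) from a bivariate Gaussian copula.
u1, u2 = [], []
for _ in range(400):
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    u1.append(phi.cdf(z1))
    u2.append(phi.cdf(z2))

# Margin choice A: Exponential(1) and Normal(mean 10, sd 2).
xa = [-math.log1p(-u) for u in u1]
ya = [NormalDist(10.0, 2.0).inv_cdf(u) for u in u2]

# Margin choice B: Uniform(0, 5) and standard logistic.
xb = [5.0 * u for u in u1]
yb = [math.log(u / (1.0 - u)) for u in u2]

tau_u = kendall_tau(u1, u2)
tau_a = kendall_tau(xa, ya)
tau_b = kendall_tau(xb, yb)
print(tau_u, tau_a, tau_b)
```

The two transformed samples have completely different margins, yet they share the dependence structure of the same copula sample.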
How to actually sample from a copula is another topic and you can find details in books (and papers) on copulas.
In R there (luckily) is the copula package (https://cran.r-project.org/web/packages/copula/index.html) that has simulation algorithms for most popular copula families.