Context:
We perform measurements of a set of parameters $(p_1, p_2, \dots)$ to obtain the experimental mean of each parameter and the covariance matrix between them. The means of the parameters are then combined to construct a response $\epsilon = f(p_1, p_2, \dots)$. The question is: how can we propagate the uncertainty stored in the covariance matrix to the response, knowing that the parameters must take positive values?
My first model:
I model my data as a Gaussian vector parametrized by a vector of means and a covariance matrix, which lets me draw samples from the multivariate normal distribution. Each set of samples is then used to reconstruct a random response $\epsilon_s$; the spread of the $\epsilon_s$ quantifies the uncertainty coming from the parameters.
This approach, while easy to implement, is quite limited: I sometimes obtain negative samples even though the parameters should stay positive.
Detailed context:
I have a dataset consisting of the mean values of N measured parameters along with their covariance matrix (stemming from experimental nuclear physics). These parameters are inherently positive. I use the means to calculate $\epsilon$ and the covariance matrix to propagate the uncertainty derived from the data.
Until now, I have been employing multivariate sampling under the assumption of a multivariate Gaussian distribution: I draw samples from a standard normal $z$ and correlate them via the transformation $X = Lz + \mu$, where $L$ is a lower-triangular (Cholesky) factor of the covariance matrix, $\mu$ is the vector of parameter means, and $X$ is a vector of randomly sampled parameters. From this formula it can easily be shown that the first two moments of $X$ match those of the experimental data.
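To make the setup concrete, here is a minimal sketch of the Cholesky-based sampling described above; the means and covariance matrix are placeholder values, not the actual experimental data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical experimental inputs: means and covariance of two parameters.
mu = np.array([1.0, 2.0])
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

# Lower-triangular Cholesky factor of the covariance matrix.
L = np.linalg.cholesky(cov)

# Draw standard-normal samples z and correlate them: X = L z + mu.
n_samples = 100_000
z = rng.standard_normal((n_samples, 2))
X = z @ L.T + mu

# The sample mean and covariance reproduce the experimental moments,
# but nothing prevents individual components of X from going negative.
print(X.mean(axis=0))           # ~ mu
print(np.cov(X, rowvar=False))  # ~ cov
```

Each row of `X` would then be fed through $f$ to produce one random response $\epsilon_s$.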
Without much additional information (I know nothing about the marginals beyond the means and covariances), I'm seeking the most rigorous approach to this problem.
Multivariate log-normal distributions seem like a common choice under the "Principle of Maximum Entropy," but there are two issues: firstly, it is a principle rather than a theorem, and secondly, I haven't been able to definitively prove that multivariate log-normal distributions are the most suitable for inherently positive data. Some studies perform multivariate Gaussian sampling and then exponentiate the samples, which, being a non-linear transform, does not preserve the initial experimental correlations.
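One way around the correlation distortion is moment matching: instead of exponentiating samples drawn with the experimental $(\mu, C)$, choose the Gaussian parameters of $\log X$ so that, after exponentiation, the mean and covariance of $X$ equal the experimental ones, using $S_{ij} = \ln(1 + C_{ij}/(\mu_i \mu_j))$ and $m_i = \ln \mu_i - S_{ii}/2$. A sketch with placeholder inputs (note the caveat that $S$ must itself remain positive semidefinite, which is not guaranteed for every $(\mu, C)$):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical experimental inputs (means must be positive).
mu = np.array([1.0, 2.0])
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

# Moment matching: pick Gaussian parameters (m, S) for log X such that
# exp of the Gaussian reproduces the experimental mean and covariance.
S = np.log1p(cov / np.outer(mu, mu))   # S_ij = ln(1 + C_ij / (mu_i mu_j))
m = np.log(mu) - 0.5 * np.diag(S)      # m_i  = ln(mu_i) - S_ii / 2

# Log-normal samples: strictly positive by construction.
n_samples = 200_000
Y = np.exp(rng.multivariate_normal(m, S, size=n_samples))

print(Y.min() > 0)              # positivity guaranteed
print(Y.mean(axis=0))           # ~ mu
print(np.cov(Y, rowvar=False))  # ~ cov
```

With this construction the first two moments of the positive samples match the experimental data exactly in expectation, which the naive "exponentiate the Gaussian samples" recipe does not achieve.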
Could copulas offer a better solution in this case? My exposure to this area was minimal during my studies (limited to a brief exercise on Sklar's theorem), and I haven't applied it in any practical context yet. I recall "Gaussian copulas" but have no idea whether "log-normal" ones exist. And in any case, aren't copulas formulated precisely to avoid guessing marginals?
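For reference, a Gaussian copula separates the dependence structure from the marginals: one samples correlated normals, maps them to uniforms through the normal CDF, and pushes the uniforms through the inverse CDF of any chosen positive marginal. A sketch with placeholder inputs and gamma marginals moment-matched to each parameter's mean and variance (the gamma choice is an assumption, not something implied by the data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical experimental inputs.
mu = np.array([1.0, 2.0])
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

# Correlation matrix for the Gaussian copula.
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)

# Correlated normals -> uniforms with Gaussian dependence.
n_samples = 100_000
z = rng.multivariate_normal(np.zeros(len(mu)), corr, size=n_samples)
u = stats.norm.cdf(z)

# Gamma marginals: shape k = mu^2 / var, scale theta = var / mu.
k = mu**2 / np.diag(cov)
theta = np.diag(cov) / mu
X = stats.gamma.ppf(u, a=k, scale=theta)

print(X.mean(axis=0))  # ~ mu, and X > 0 by construction
# Caveat: the linear correlation of X only approximates corr; an exact
# match would require adjusting the copula correlation matrix.
```

This answers one part of the question directly: the copula fixes the dependence, but the marginals still have to be chosen (or estimated) separately, so it does not remove the need to guess them.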
How would you approach this issue most rigorously?
Thank you for your insights.