How to mathematically prove that we are sampling from the same distribution?


This question is about rigorously proving something that is usually accepted as intuitively obvious.

Let's assume we have a multivariate distribution $g(x_1,x_2,...,x_n)$ over the variables $x_{1:n}$. Let's assume that we know how to sample from that distribution, too. We draw samples $x^{1}_{1:n},x^{2}_{1:n}, ... ,x^{N}_{1:n}$ from this distribution.

Then assume we have the distributions $f_{1}(x_1), f_{2}(x_2|x_1), f_{3}(x_3|x_2,x_1), \ldots, f_n(x_n|x_{1:n-1})$ such that $g(x_1,x_2,\ldots,x_n) = f_{1}(x_1)\,f_{2}(x_2|x_1)\,f_{3}(x_3|x_2,x_1)\cdots f_n(x_n|x_{1:n-1})$ by the chain rule of probability. Now, for each sample $x^{i}_{1:n}$ we first draw $x^{i}_{1}$ from $f_{1}(x_1)$, then $x^{i}_{2}$ from $f_{2}(x_2|x^{i}_1)$, and so on up to $x^{i}_{n}$ from $f_n(x_n|x^{i}_{1:n-1})$. We again obtain $N$ samples.
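The two procedures can be sketched concretely. As a minimal, hypothetical example (not part of the question), take $n = 2$ with $g$ a bivariate normal with correlation $\rho$, for which both the marginal $f_1$ and the conditional $f_2$ are known in closed form:

```python
import numpy as np

# Hypothetical example: sample from a bivariate normal g(x1, x2)
# with correlation rho in the two ways described above.
rng = np.random.default_rng(0)
rho = 0.6
N = 100_000

# Method 1: draw directly from the joint distribution g.
cov = np.array([[1.0, rho], [rho, 1.0]])
joint = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=N)

# Method 2: chain rule -- first x1 ~ f1(x1) = N(0, 1),
# then x2 | x1 ~ f2(x2 | x1) = N(rho * x1, 1 - rho^2).
x1 = rng.normal(0.0, 1.0, size=N)
x2 = rng.normal(rho * x1, np.sqrt(1.0 - rho**2))
seq = np.column_stack([x1, x2])

# Both empirical correlations should be close to rho.
print(np.corrcoef(joint.T)[0, 1], np.corrcoef(seq.T)[0, 1])
```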

Intuitively, the first $N$ samples, drawn from $g(x_1,x_2,\ldots,x_n)$, and the second $N$ samples, each drawn sequentially from $f_{1}(x_1), f_{2}(x_2|x_1), f_{3}(x_3|x_2,x_1), \ldots, f_n(x_n|x_{1:n-1})$, are identically distributed. But how can we show this fact in a mathematically rigorous way? I could not come up with a proof and got stuck.

Thanks in advance.

Best Answer

From the fact that you are equating the two distribution functions, you have in effect already proved that the samples come from the same distribution: the probability that a random vector falls in any given measurable set is identical in the two cases.

As a practical matter, the question arises whether numerically generated samples, for example in Monte Carlo simulation, are from the same distribution. That can be answered only in terms of a statistical confidence level. In other words, if you generated two samples using algorithms based on the different representations of the distributions you propose, then you can accept or reject a hypothesis that they come from the same distribution to some degree of confidence using any of a number of statistical tests. The Kolmogorov-Smirnov test is an example.
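Such a check can be sketched as follows (assuming SciPy is available; the bivariate normal and its chain-rule factorization are an illustrative choice, and the Kolmogorov-Smirnov test as used here is univariate, so it is applied to a single marginal):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
N = 50_000
rho = 0.6

# Samples of x2 obtained two ways from the same bivariate normal;
# in both cases the marginal of x2 is N(0, 1).
joint = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=N)
x1 = rng.normal(size=N)
x2_seq = rng.normal(rho * x1, np.sqrt(1.0 - rho**2))

# Two-sample Kolmogorov-Smirnov test on the x2 marginals:
# a large p-value means we cannot reject the hypothesis that
# the two samples come from the same distribution.
stat, pvalue = ks_2samp(joint[:, 1], x2_seq)
print(stat, pvalue)
```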

This line of thought leads to the question of how one numerically generates a sample from a specified distribution. As it turns out, the representation by the chain of conditional distributions is exactly what is used.

For example, suppose we consider a multivariate normal distribution. In step one we generate a sequence of independent uniformly distributed numbers $u_1, u_2, u_3, \ldots$ using a random-number generator. We then transform these into a sequence from the desired distribution by inverting each conditional cumulative distribution function (CDF). Writing $F_1$ for the CDF of the marginal $f_1$ and $F_k(\,\cdot \mid x_{1:k-1})$ for the CDF of the conditional $f_k$, the first sample is the inverse marginal CDF evaluated at the first uniformly distributed number:

$$ x_1 = F_1^{-1}(u_1). $$

The second sample $x_2$ is obtained by transforming the second uniformly distributed number with the inverse of the conditional CDF given $x_1$:

$$ x_2 = F_2^{-1}(u_2 \mid x_1). $$

And so on ...
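The inverse-CDF step is easiest to see in one dimension. A minimal sketch, using the exponential distribution (an illustrative choice, not from the answer), where $F(x) = 1 - e^{-\lambda x}$ gives $F^{-1}(u) = -\ln(1-u)/\lambda$:

```python
import numpy as np

# Inverse-CDF (inverse transform) sampling for Exponential(lam):
# F(x) = 1 - exp(-lam * x), so F^{-1}(u) = -log(1 - u) / lam.
rng = np.random.default_rng(2)
lam = 2.0
u = rng.uniform(size=100_000)   # independent Uniform(0, 1) draws
x = -np.log(1.0 - u) / lam      # transformed into Exponential(lam) draws

# The sample mean should be close to the true mean 1 / lam.
print(x.mean())
```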

To understand this better, consider how the Cholesky decomposition arises when sampling from a bivariate joint normal distribution, $f(x_1,x_2)$, where the correlation between the random components is $\rho$. First we generate two independent uniform random numbers $u_1$ and $u_2$. Then we apply the inverse of the marginal cumulative distribution function (here the standard normal CDF $\Phi$), corresponding to the marginal $f_1$ described above. This transforms the independent uniform random numbers into independent standard normal random numbers:

$$ z_1 = \Phi^{-1}(u_1), \ \ \ z_2 = \Phi^{-1}(u_2). $$

Finally we map the independent normal random numbers into normal random numbers with the appropriate correlation:

$$ x_1 = z_1, \ \ \ x_2 = \rho z_1 + \sqrt{1 - \rho^2}z_2 . $$

Notice that the generation of $x_2$ depends on $x_1$ (through $z_1$). Hence, in practice, when we set out to sample from $f(x_1,x_2)$ and impose the desired correlation structure, we are in effect first sampling from $f_1(x_1)$ and then from $f_2(x_2|x_1)$.
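The two-step construction above can be sketched directly (a minimal illustration, with $\rho$ chosen arbitrarily):

```python
import numpy as np

# Cholesky-style construction of a correlated bivariate normal:
# independent standard normals z1, z2, then
# x1 = z1, x2 = rho*z1 + sqrt(1 - rho^2)*z2.
rng = np.random.default_rng(3)
rho = 0.8
N = 200_000

z1 = rng.standard_normal(N)   # plays the role of Phi^{-1}(u1)
z2 = rng.standard_normal(N)   # plays the role of Phi^{-1}(u2)
x1 = z1
x2 = rho * z1 + np.sqrt(1.0 - rho**2) * z2

# The empirical correlation should be close to rho, and each
# marginal should remain standard normal (variance 1).
print(np.corrcoef(x1, x2)[0, 1], x2.std())
```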