How to sample from joint Bernoulli distribution?

I have searched on google.com and here but haven't found a related question. I apologize if this is a re-post.

Here is my question:

I have a set of $n$ Bernoulli random variables, and I know the marginal distributions $p_i$ and the pairwise joint distributions $p_{ij}$. I would like to generate $m$ i.i.d. samples from these random variables. How should I do this? Any input is appreciated.

Maybe I should have asked this question in the first place: in practice, when talking about correlation, is the pairwise joint distribution or the pairwise correlation more commonly used?

Thanks!

There is 1 best solution below

First, knowing the pairwise distributions $p_{ij}$ already gives you the marginals $p_i$. On the other hand, that data is not enough to determine the full joint distribution.

If you are only interested in generating Bernoulli variables that fit those $p_{ij}$, without regard for higher-order distributions, then simply assume a Markov process: generate $x_1$ according to $p_1$, then $x_2$ according to $p_{2|1}=p_{12}/p_{1}$, and so on.
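A minimal sketch of that Markov-chain idea, in case it helps. The function name `sample_chain` and the table layout `table[a][b] = P(x_k = a, x_{k+1} = b)` are my own choices for illustration, and the sketch assumes the consecutive tables are mutually consistent (each pair of neighbouring tables agrees on the shared marginal).

```python
import random

def sample_chain(p_first, pair_tables, rng=random):
    """Draw one vector (x_1, ..., x_n) of 0/1 values from a first-order Markov chain.

    p_first        : P(x_1 = 1)
    pair_tables[k] : 2x2 table with table[a][b] = P(x_{k+1} = a, x_{k+2} = b)
                     for the k-th consecutive pair (hypothetical layout).
    """
    x = [1 if rng.random() < p_first else 0]
    for table in pair_tables:
        prev = x[-1]
        # Marginal probability of the previous value, read off the pair table.
        marginal_prev = table[prev][0] + table[prev][1]
        # Conditional probability P(next = 1 | prev) = joint / marginal.
        p_next = table[prev][1] / marginal_prev
        x.append(1 if rng.random() < p_next else 0)
    return x

# Example: three variables, two consecutive pair tables.
t12 = [[0.3, 0.2], [0.1, 0.4]]   # P(x1=1) = 0.5, P(x2=1) = 0.6
t23 = [[0.3, 0.1], [0.2, 0.4]]   # P(x2=1) = 0.6, P(x3=1) = 0.5
samples = [sample_chain(0.5, [t12, t23]) for _ in range(5)]
```

Calling it $m$ times gives $m$ independent vectors, since each call starts afresh from $x_1$.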

Regarding your last question: no special preference. You can give either $p_1$, $p_2$ and $\rho_{12}$, or $p_{12}$. In both cases there are three degrees of freedom, and it's trivial to convert from one representation to the other.
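To make the conversion concrete, here is a small sketch using the standard identity $P(x_1=1,x_2=1)=p_1p_2+\rho_{12}\sqrt{p_1(1-p_1)\,p_2(1-p_2)}$ for 0/1-valued variables. The function names are hypothetical, and it assumes $0<p_1,p_2<1$ so the correlation is defined.

```python
from math import sqrt

def corr_to_p11(p1, p2, rho):
    """P(x1 = 1, x2 = 1) from the two marginals and the correlation."""
    return p1 * p2 + rho * sqrt(p1 * (1 - p1) * p2 * (1 - p2))

def p11_to_corr(p1, p2, p11):
    """Correlation from the two marginals and P(x1 = 1, x2 = 1)."""
    return (p11 - p1 * p2) / sqrt(p1 * (1 - p1) * p2 * (1 - p2))

def joint_table(p1, p2, p11):
    """Full 2x2 table, table[a][b] = P(x1 = a, x2 = b)."""
    return [[1 - p1 - p2 + p11, p2 - p11],
            [p1 - p11, p11]]

p11 = corr_to_p11(0.5, 0.6, 0.2)
table = joint_table(0.5, 0.6, p11)
```

Not every $\rho_{12}$ is admissible for given marginals: all four entries of the table must come out non-negative.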


Update: The above is only useful for fitting the pair distributions of consecutive elements ($p_{1,2},p_{2,3},\dots$); it does not fit the other pair correlations. A complete treatment is more complicated.

A multivariate Bernoulli is fully (and uniquely) specified by the $2^n$ joint probabilities $p_{\bf x}$, restricted to $0 \le p_{\bf x} \le 1$ and $\sum p_{\bf x}=1$. An alternative (and perhaps more convenient) representation is given (refer to this answer) by the $2^n-1$ coefficients $a_{i...k}=E[x_i \cdots x_k]$, where $x_i=\pm1$ and $a_{\emptyset}=1$. The two representations are linearly related through a Hadamard matrix.

In our case, we are given the $n$ first-order coefficients $a_i$ and the $n(n-1)/2$ second-order coefficients $a_{i,j}$. If we are at liberty to choose the other coefficients, we can try setting them to zero and checking that the resulting ${\bf p}={\bf M a}$ (with ${\bf M}$ the Hadamard-type linear map mentioned above) falls into the allowed hypercube, i.e. that every entry is a valid probability. Once we have found this (or some other acceptable solution), we have the full joint probability, and we can proceed to generate samples.
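A rough sketch of that check. With all higher-order coefficients set to zero, the linear map reduces to $p_{\bf x}=2^{-n}\bigl(1+\sum_i a_i x_i+\sum_{i<j} a_{i,j}\,x_i x_j\bigr)$ for $x_i=\pm1$. The function names, the 0-based indices and the dictionary layout are assumptions made for illustration only.

```python
from itertools import product

def joint_from_low_order(a1, a2):
    """Joint probabilities with third- and higher-order coefficients set to zero.

    a1[i]      : E[x_i], with x_i in {-1, +1}
    a2[(i, j)] : E[x_i x_j] for i < j (0-based indices, hypothetical layout)
    Returns a dict mapping each tuple x in {-1, +1}^n to its candidate probability.
    """
    n = len(a1)
    probs = {}
    for x in product((-1, 1), repeat=n):
        total = 1.0                                            # a_emptyset = 1
        total += sum(a1[i] * x[i] for i in range(n))           # first-order terms
        total += sum(a * x[i] * x[j] for (i, j), a in a2.items())  # second-order terms
        probs[x] = total / 2 ** n
    return probs

def is_valid(probs, tol=1e-12):
    """True if every entry is non-negative (they sum to 1 by construction)."""
    return all(p >= -tol for p in probs.values())

a1 = [0.0, 0.2, 0.0]
a2 = {(0, 1): 0.2, (1, 2): 0.2, (0, 2): 0.1}
probs = joint_from_low_order(a1, a2)
ok = is_valid(probs)
```

If `is_valid` returns `False`, the zero choice is not admissible and some other values for the higher-order coefficients have to be tried. The table has $2^n$ entries, so this is only practical for small $n$.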

For sample generation, given the full joint probability, we can either generate the components in order using $P(x_i|x_{i-1} x_{i-2} \cdots x_1)$, or we can use Gibbs sampling or a Metropolis algorithm.
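For illustration, a bare-bones Gibbs sampler over a joint table might look like the sketch below. It assumes the joint is stored as a dict from $\pm1$ tuples to probabilities (my own layout), and the name `gibbs_samples` and the `burn_in` default are arbitrary choices. Note that successive Gibbs draws are correlated, so they are only approximately i.i.d.

```python
import random

def gibbs_samples(probs, n, num_samples, burn_in=100, rng=random):
    """Gibbs sampling from a joint table probs[x], x a tuple in {-1, +1}^n.

    Each sweep resamples every coordinate from its conditional given the rest.
    """
    # Start from the most probable state so the conditionals are well defined.
    x = list(max(probs, key=probs.get))
    out = []
    for sweep in range(burn_in + num_samples):
        for i in range(n):
            x[i] = 1
            p_plus = probs[tuple(x)]
            x[i] = -1
            p_minus = probs[tuple(x)]
            # P(x_i = +1 | all other coordinates)
            x[i] = 1 if rng.random() < p_plus / (p_plus + p_minus) else -1
        if sweep >= burn_in:
            out.append(tuple(x))
    return out

probs = {(-1, -1): 0.3, (-1, 1): 0.2, (1, -1): 0.1, (1, 1): 0.4}
draws = gibbs_samples(probs, n=2, num_samples=1000)
```

When $n$ is small enough to hold the whole table in memory, `random.choices(list(probs), weights=list(probs.values()), k=m)` draws exact i.i.d. samples directly; Gibbs or Metropolis becomes more interesting when only ratios of probabilities are cheap to evaluate.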