Generating i.i.d. samples from a mixture of distributions


Let $D_1$ and $D_2$ be distributions. Assume that:

  • one can always get any finite number of i.i.d. samples from $D_1$;
  • one can always get any finite number of i.i.d. samples from $D_2$.

Let also $D=\frac12 D_1 + \frac12 D_2$ and let $m$ be an integer. How can we get at least $m$ i.i.d. samples from $D$?

Is it true that we can just pick $m/2$ iid samples from $D_1$ and $m/2$ iid samples from $D_2$?


There are 2 best solutions below

2
On BEST ANSWER

Let $X_1,\dots,X_m,Y_1,\dots,Y_m, B_1,\dots,B_m$ be independent, with $X_i$ distributed according to $D_1$, $Y_i$ distributed according to $D_2$, and $B_i$ having a Bernoulli distribution with parameter $0.5$.

If $Z_i:=B_iX_i+(1-B_i)Y_i$ then $Z_1,Z_2,\dots,Z_m$ are i.i.d. and distributed according to $D$.

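This is because, for any measurable set $W$, conditioning on $B_1$ and using that $B_1$ is independent of $X_1$ and $Y_1$ gives
$$\begin{aligned}P(Z_1\in W)&=P(Z_1\in W\mid B_1=1)P(B_1=1)+P(Z_1\in W\mid B_1=0)P(B_1=0)\\&=P(X_1\in W)\,0.5+P(Y_1\in W)\,0.5=D_1(W)\,0.5+D_2(W)\,0.5=D(W).\end{aligned}$$
The $Z_i$ are independent because each $Z_i$ is a function of $(B_i,X_i,Y_i)$ only, and these triples are independent of one another.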
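For anyone who wants to try this numerically, here is a minimal Python sketch of the construction $Z_i=B_iX_i+(1-B_i)Y_i$. The function name `mix_from_pools` and the two Gaussian stand-ins for $D_1$ and $D_2$ are assumptions made purely for illustration; any two samplers would do.

```python
import random

def mix_from_pools(xs, ys, rng):
    """Combine pre-drawn samples xs (from D_1) and ys (from D_2) into samples from D.

    Hypothetical helper: implements Z_i = B_i * X_i + (1 - B_i) * Y_i
    with B_i ~ Bernoulli(1/2), one independent coin per index i.
    """
    if len(xs) != len(ys):
        raise ValueError("need equally many samples from D_1 and D_2")
    zs = []
    for x, y in zip(xs, ys):
        b = 1 if rng.random() < 0.5 else 0  # B_i ~ Bernoulli(1/2)
        zs.append(x if b == 1 else y)       # keep X_i if B_i = 1, else Y_i
    return zs

rng = random.Random(0)
m = 10
# Illustrative stand-ins only: D_1 = N(-5, 1), D_2 = N(5, 1).
xs = [rng.gauss(-5.0, 1.0) for _ in range(m)]
ys = [rng.gauss(5.0, 1.0) for _ in range(m)]
zs = mix_from_pools(xs, ys, rng)
print(zs)
```

Note that each index $i$ consumes one $X_i$ and one $Y_i$ and discards the one not selected, which is what keeps the $Z_i$ independent.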

0
On

No, taking $m/2$ samples from both is not enough. You have a mixture: the "standard" process is, for each sample, to

  • toss a fair coin (get a realization of a Bernoulli $1/2$);
  • if it is heads, draw from $D_1$; otherwise, draw from $D_2$;
  • repeat $m$ times independently.
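The steps above can be sketched in Python as follows. This is only an illustration: `sample_mixture`, `draw_d1` and `draw_d2` are hypothetical names, and the two uniform distributions are stand-ins chosen so that one can tell afterwards which component each sample came from.

```python
import random

def sample_mixture(draw_d1, draw_d2, m, rng):
    """Draw m samples from D = (1/2) D_1 + (1/2) D_2 by tossing a fair coin per sample."""
    samples = []
    for _ in range(m):
        if rng.random() < 0.5:         # heads: probability exactly 1/2
            samples.append(draw_d1())  # draw from D_1
        else:                          # tails
            samples.append(draw_d2())  # draw from D_2
    return samples

rng = random.Random(1)
# Illustrative stand-ins only: D_1 = Uniform(0, 1), D_2 = Uniform(10, 11).
draw_d1 = lambda: rng.uniform(0.0, 1.0)
draw_d2 = lambda: rng.uniform(10.0, 11.0)

samples = sample_mixture(draw_d1, draw_d2, 1000, rng)
from_d1 = sum(1 for s in samples if s < 5.0)  # how many came from D_1
print(from_d1, len(samples) - from_d1)
```

The count `from_d1` is itself random: it will usually be close to, but not exactly, half of the total.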

It is clear that in some cases one could even have $m$ samples from $D_1$ (and $0$ from $D_2$), for instance! This happens with very small (in that specific case, exponentially small: $1/2^m$) probability, but it can happen. And in most cases, you'll have a bit more than $m/2$ samples from one distribution and a bit less than $m/2$ from the other: roughly, $\frac{m}{2}\pm\sqrt{m}$.
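To make these counts precise (a short sketch; the symbols $B_i$ and $N$ are introduced here for illustration), let $B_i=1$ if the $i$-th toss is heads and $0$ otherwise, and let $N$ be the number of the $m$ samples that end up being drawn from $D_1$. Then

$$N=\sum_{i=1}^m B_i\sim\mathrm{Binomial}\left(m,\tfrac12\right),\qquad P(N=m)=2^{-m},\qquad E[N]=\frac m2,\qquad \mathrm{Var}(N)=\frac m4.$$

So the standard deviation of $N$ is $\sqrt{m}/2$, and deviations from $m/2$ are typically of order $\sqrt{m}$. Taking exactly $m/2$ samples from each distribution would force $N=m/2$, which is a different law. For instance, with $m=2$, $D_1=\delta_0$ and $D_2=\delta_1$ (point masses chosen purely as an example), two i.i.d. samples from $D$ are both equal to $0$ with probability $1/4$, while the fixed split never produces that outcome.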

To be certain you can generate your $m$ samples from $D$ with probability $1$, you therefore have to start with $m$ i.i.d. samples from $D_1$ and $m$ i.i.d. samples from $D_2$, and then do the above.

You basically get twice as many samples as necessary and end up wasting the unused ones, but that's a small price to pay.