Let
- $(E,\mathcal E,\lambda)$ be a measure space
- $k\in\mathbb N$
- $q_i:E\to[0,\infty)$ be $\mathcal E$-measurable with $$\int q_i\:{\rm d}\lambda=1$$ and $\nu_i:=q_i\lambda$ for $i\in\{1,\ldots,k\}$
- $\alpha_1,\ldots,\alpha_k\ge0$ with $$\sum_{i=1}^k\alpha_i=1\tag1$$ and $$\nu:=\underbrace{\sum_{i=1}^k\alpha_iq_i}_{=:\:q}\lambda$$
- $p:E\to[0,\infty)$ be $\mathcal E$-measurable with $$\int p\:{\rm d}\lambda=1$$ and $\mu:=p\lambda$
- $(\Omega,\mathcal A,\operatorname P)$ be a probability space
- $n\in\mathbb N$ and $X_1,\ldots,X_n$ be independent $(E,\mathcal E)$-valued random variables on $(\Omega,\mathcal A,\operatorname P)$ with $X_1,\ldots,X_n\sim\nu$
In multiple importance sampling we would take $(E,\mathcal E)$-valued random variables $Y_{ij}$ on $(\Omega,\mathcal A,\operatorname P)$ with $$Y_{ij}\sim\nu_j$$ for $i\in\{1,\ldots,n_j\}$ and $j\in\{1,\ldots,k\}$ such that all $Y_{ij}$ are mutually independent.
Now I've read that the samples $Y_{ij}$ can be treated as if they were drawn from the mixture distribution $\nu$ with $\alpha_i=n_i/n$.
What exactly does that mean?
Your question has changed considerably from its original form and now has (a priori) nothing to do with importance sampling anymore, so I will explain it without the importance sampling aspect, as if you were just trying to approximate an expected value $$ \mathbb E_q[f] = \int fq\, \mathrm d\lambda. $$
The idea is that if you draw 5 independent samples from $q_1$ and 3 independent samples from $q_2$, then intuitively you have drawn 8 samples from $$ q = \frac{5}{8}q_1 + \frac{3}{8}q_2. $$
However, these samples cannot be treated as being i.i.d. from $q$! The easiest way to see the dependence is to imagine that $q = (q_1+q_2)/2$ and that $q_1$ and $q_2$ have disjoint (think of far apart) supports. If you draw one sample from $q_1$ and one from $q_2$, they will definitely be far apart, whereas two i.i.d. samples from $q$ would both land in the support of $q_1$ with probability $1/4$. So the two samples are dependent if interpreted as samples from $q$! (These considerations are intuitive rather than rigorous.)
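The disjoint-support example can be checked numerically. The following is only a rough sketch under assumptions of my own: $q_1$ is taken to be uniform on $[0,1]$ and $q_2$ uniform on $[10,11]$, and all function names are hypothetical. It estimates the probability that both samples of a pair fall into the support of $q_1$, once for a pair consisting of one draw from each component and once for a pair of i.i.d. draws from $q=(q_1+q_2)/2$.

```python
import random


def draw_q1(rng):
    # Assumed component q_1: uniform on [0, 1]
    return rng.uniform(0.0, 1.0)


def draw_q2(rng):
    # Assumed component q_2: uniform on [10, 11], disjoint from q_1
    return rng.uniform(10.0, 11.0)


def stratified_pair(rng):
    # One sample from each component, as in the "one from q_1, one from q_2" scheme
    return draw_q1(rng), draw_q2(rng)


def iid_pair(rng):
    # Two independent draws from the mixture q = (q_1 + q_2) / 2
    def draw_q():
        return draw_q1(rng) if rng.random() < 0.5 else draw_q2(rng)

    return draw_q(), draw_q()


def frac_both_in_first_support(pair_sampler, trials, seed=0):
    # Fraction of pairs whose two entries both lie in [0, 1]
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a, b = pair_sampler(rng)
        if a <= 1.0 and b <= 1.0:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(frac_both_in_first_support(stratified_pair, 20000))
    print(frac_both_in_first_support(iid_pair, 20000))
```

For the pair built from one draw per component the event never occurs, while for the i.i.d. pair its frequency should be close to $1/4$, so the two schemes do not produce the same joint distribution.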
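If you want to sample independently from the above $q$, you can proceed as follows:
- First draw a random index $J\in\{1,2\}$ with $\operatorname P(J=1)=5/8$ and $\operatorname P(J=2)=3/8$.
- Then, given $J=j$, draw a sample from $q_j$.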
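A minimal Python sketch of this composition scheme, using only the standard library: each draw first picks a component index with the mixture weights and then samples from the chosen component. The function name `sample_mixture` and the two Gaussian components are assumptions made for illustration and do not come from the question.

```python
import random


def sample_mixture(weights, component_samplers, n, seed=0):
    """Draw n i.i.d. samples from sum_j weights[j] * q_j by composition."""
    rng = random.Random(seed)
    indices = list(range(len(weights)))
    samples = []
    for _ in range(n):
        # Step 1: pick the component index J with P(J = j) = weights[j]
        j = rng.choices(indices, weights=weights, k=1)[0]
        # Step 2: draw from the selected component q_J
        samples.append(component_samplers[j](rng))
    return samples


# Hypothetical components: q_1 = N(0, 1) and q_2 = N(4, 1)
weights = [5 / 8, 3 / 8]
component_samplers = [
    lambda rng: rng.gauss(0.0, 1.0),
    lambda rng: rng.gauss(4.0, 1.0),
]
draws = sample_mixture(weights, component_samplers, n=8)
```

Note that the number of draws coming from each component is random here (binomially distributed), in contrast to the fixed counts 5 and 3 above.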
This gives you truly independent samples from $q$. However, the approach described in the beginning, which is called "stratified sampling", still seems reasonable for Monte Carlo. In fact, the corresponding MC estimator can be proven to be never worse (in terms of variance) than the one based on independent draws from $q$, which is called "composition sampling".
Both approaches are described in the book "Rubinstein, Kroese - Simulation and the Monte Carlo Method": the composition method in Section 2.3.3 and stratified sampling in Section 5.5. The theoretical result is given in Proposition 5.5.1.
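The variance comparison can be sketched with the law of total variance. The notation below is mine and not taken from the book, and I assume that $f$ is square-integrable with respect to $q$ and that $n_j=\alpha_jn$ is an integer for every $j$. Write $$m_j:=\int fq_j\:{\rm d}\lambda,\qquad\sigma_j^2:=\int(f-m_j)^2q_j\:{\rm d}\lambda,\qquad m:=\sum_{j=1}^k\alpha_jm_j=\mathbb E_q[f].$$ With $Y_{ij}\sim\nu_j$ as in the question, the stratified estimator and its variance are $$\hat m_{\text{strat}}:=\sum_{j=1}^k\frac{\alpha_j}{n_j}\sum_{i=1}^{n_j}f(Y_{ij}),\qquad\operatorname{Var}\left[\hat m_{\text{strat}}\right]=\sum_{j=1}^k\frac{\alpha_j^2\sigma_j^2}{n_j}=\frac1n\sum_{j=1}^k\alpha_j\sigma_j^2.$$ For $X_1,\ldots,X_n$ i.i.d. from $q$, on the other hand, $$\operatorname{Var}\left[\frac1n\sum_{i=1}^nf(X_i)\right]=\frac1n\left(\sum_{j=1}^k\alpha_j\sigma_j^2+\sum_{j=1}^k\alpha_j(m_j-m)^2\right).$$ The difference between the two variances is $\frac1n\sum_{j=1}^k\alpha_j(m_j-m)^2\ge0$, so under these assumptions stratification removes exactly the between-component part of the variance.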
All of this can, of course, be applied using $q$ as an importance sampling density and then one has to distinguish between mixture importance sampling and multiple importance sampling, but this is no longer the question, right?