Conditional probability: Multinomial Partitioning


Say we start off with $N$ items, of which $M_b$ are labelled $M$ and $N-M_b$ are labelled $A$. We make two moves: 1) From this two-state system, select $m$ of the $M_b$ items of type $M$ and $a$ of the $N-M_b$ items of type $A$ to be transferred; the remaining $N-m-a$ positions become vacancies. 2) Each vacancy is then filled at random with an item of type $M$ or type $A$, and we reach a final state with $M_a$ items labelled $M$. This process results in the following distribution of the number of $M$-type items $M_a$:

$$D (M_a, M_b) = \sum_m \sum_a {M_b\choose m} {N-M_b\choose a} {N-m-a\choose M_a-m} \frac{1}{2^{N-m-a}}$$

where I am summing over the number of ways of getting from state $M_b$ to state $M_a$ by choosing $m \leq \min (M_a, M_b)$ and $a \leq \min (N-M_a, N-M_b)$.

My reasoning is as follows: ${M_b\choose m}$ and ${N-M_b\choose a}$ are the number of ways I can choose the items to be transferred in step one. Then the term ${N-m-a\choose M_a-m} \frac{1}{2^{N-m-a}}$ is the probability of populating the $N-m-a$ vacancies such that I end up with $M_a$ items of type $M$.

However, I seem to be off by a factor of $2^N$. The correct distribution is

$$D (M_a, M_b) = \sum_m \sum_a {M_b\choose m} {N-M_b\choose a} {N-m-a\choose M_a-m} \frac{1}{2^{2N-m-a}}$$
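One way to compare the two expressions is to sum each of them over all final states $M_a$ and see which one is normalised. The following is a minimal sketch, not a derivation: the function name `partition_sum` and the parameter `exponent_offset` are made up for illustration, with `exponent_offset = N` giving my expression and `exponent_offset = 2 * N` giving the paper's.

```python
from fractions import Fraction
from math import comb


def partition_sum(N, M_b, M_a, exponent_offset):
    """Sum over m and a of
    C(M_b, m) * C(N - M_b, a) * C(N - m - a, M_a - m) / 2**(exponent_offset - m - a).

    exponent_offset is a hypothetical knob: N reproduces the first
    expression, 2 * N reproduces the second one.
    """
    total = Fraction(0)
    for m in range(min(M_a, M_b) + 1):
        for a in range(min(N - M_a, N - M_b) + 1):
            vacancies = N - m - a   # positions refilled in step 2
            new_M = M_a - m         # vacancies that must receive type M
            ways = comb(M_b, m) * comb(N - M_b, a) * comb(vacancies, new_M)
            total += Fraction(ways, 2 ** (exponent_offset - m - a))
    return total


if __name__ == "__main__":
    N, M_b = 6, 2
    for label, offset in (("first", N), ("second", 2 * N)):
        norm = sum(partition_sum(N, M_b, M_a, offset) for M_a in range(N + 1))
        print(label, "expression summed over M_a:", norm)
```

For any $N$ and $M_b$, the first expression sums to $2^N$ and the second sums to $1$. This is consistent with the first expression counting the selections in step 1 without weighting them: if each of the $N$ original items is transferred or not with probability $1/2$ (an assumption about what "partitioned randomly" means), every particular selection carries a weight $1/2^N$.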

A little background: I am trying to describe how nucleosomes are distributed during cell division. Specifically, I am working through this paper and trying to derive its equation (12). The relevant passage is:

Epigenetic states are capable of being inherited across cell divisions. This can give difficulties for stability of the states [4], particularly for 2-state systems [13]. At cell division the genome is duplicated, and following [4, 15] we assume that the resident nucleosomes are partitioned randomly between the daughter strands. The vacant positions are filled by new randomly selected nucleosomes where half are in the M-state and half in the A-state. We accordingly supplement our model above with cell divisions at certain fixed time intervals. This cell generation time is measured in units of the number of attempted nucleosome updates per nucleosome. (...) Consider that before cell division the system is in a state with $M_b = m_b \times N$ nucleosomes in the M-state and the remaining nucleosomes in the A-state. Cell division results in the distribution of number of M-state nucleosomes $M_a$: $$D (M_a, M_b) = \sum_m \sum_a {M_b\choose m} {N-M_b\choose a} {N-m-a\choose M_a-m} \frac{1}{2^{2N-m-a}}$$ where the sum runs over all the ways of getting from $M_b$ to $M_a$ by selecting $m \leq \min (M_a, M_b)$ nucleosomes in the M-state and $a \leq \min (N - M_a, N - M_b)$ nucleosomes in the A-state to be transferred directly at the cell division.
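The process described in the passage can also be simulated directly. This is a rough sketch under two assumptions that are my reading of the text rather than something it states explicitly: each resident nucleosome ends up on a given daughter strand independently with probability $1/2$, and each vacancy is filled with an M-state nucleosome with probability $1/2$. The function names `simulate_division` and `empirical_distribution` are hypothetical.

```python
import random
from collections import Counter


def simulate_division(N, M_b, rng):
    """Return the number of M-state nucleosomes on one daughter strand."""
    # Assumption: each resident nucleosome is transferred with probability 1/2.
    m = sum(rng.random() < 0.5 for _ in range(M_b))      # transferred M-state
    a = sum(rng.random() < 0.5 for _ in range(N - M_b))  # transferred A-state
    # Assumption: each of the N - m - a vacancies becomes M with probability 1/2.
    new_M = sum(rng.random() < 0.5 for _ in range(N - m - a))
    return m + new_M


def empirical_distribution(N, M_b, trials, seed=0):
    """Estimate the distribution of M_a as relative frequencies for M_a = 0..N."""
    rng = random.Random(seed)
    counts = Counter(simulate_division(N, M_b, rng) for _ in range(trials))
    return [counts[k] / trials for k in range(N + 1)]


if __name__ == "__main__":
    for M_a, p in enumerate(empirical_distribution(N=8, M_b=4, trials=20000)):
        print(M_a, round(p, 4))
```

The estimated frequencies can then be compared against $D(M_a, M_b)$ evaluated for the same $N$ and $M_b$.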