Probability distributions of beads on chains after choosing a subset, and cutting them

$$A = {M \choose m}\sum_{k = 0}^{n - m} {N - M \choose k} {I_{m+k}}$$

is the abundance of all fragments of length $M$ with $m$ black beads and $M-m$ white beads, taken over all distinct sequences of chains of length $N$ containing between $m$ and $n$ black beads, where $I_{m+k}$ is the abundance of one specific chain with $m+k$ black beads. Now, let me elaborate and pose the question: "What if we introduced beads that count as two black beads?"

Imagine a setup where you have bags containing a near-infinite (for our purposes) number of chains, each with the same number of beads $N$. Whenever we look at the contents of a bag, we sort the chains by their weight. There are a certain number of different bead types, and their weights increase in steps of 1 unit. That is, we may have types of beads of weight w, w+1, w+2, w+3, etc. Each bead type has a certain abundance $\sigma$, which is the same in all conceivable bags, and the beads are randomly distributed on the chains.

Let's look at a scenario where we have 3 different kinds of beads (white, black and blue) and a bag with chains of 4 beads. These are the different chains that get sorted into each category (by their weight):

w = white - has a mass of w
b = black - has a mass of w+1
B = blue - has a mass of w+2

W = 4*w

W+0: wwww
W+1: bwww, wbww, wwbw, wwwb
W+2: bbww, bwbw, bwwb, wbbw, wbwb, wwbb, Bwww, wBww, wwBw, wwwB
W+3: bbbw, bbwb, bwbb, wbbb, Bbww, Bwbw, Bwwb, bBww, wBbw, wBwb, bwBw, wbBw, wwBb, bwwB, wbwB, wwbB

Note that the chains have direction, meaning that "wwb" is not identical to "bww".
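
As a sanity check, the listing above can be reproduced by brute force. This is only a minimal Python sketch; the names `EXTRA_WEIGHT` and `chains_by_weight` are made up for illustration, and chains are treated as directed strings, as stated in the note above.

```python
from collections import defaultdict
from itertools import product

# Extra weight of each bead type relative to a white bead (from the example).
EXTRA_WEIGHT = {"w": 0, "b": 1, "B": 2}

def chains_by_weight(length, extra_weight=EXTRA_WEIGHT):
    """Group every directed chain of the given length by its extra weight."""
    groups = defaultdict(list)
    for beads in product(extra_weight, repeat=length):
        chain = "".join(beads)
        groups[sum(extra_weight[bead] for bead in beads)].append(chain)
    return dict(groups)

groups = chains_by_weight(4)
for extra in range(4):
    # Number of distinct chains in category W+extra
    print(f"W+{extra}: {len(groups[extra])} chains")
```

For chains of 4 beads this should give 1, 4, 10 and 16 chains in the categories W+0 to W+3, matching the lists above.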

How abundant each weight category is depends solely on the abundances of the different beads. Summed over all categories, the relative abundances amount to 1: $$\sum_{k=0}^{n} A_k = A_0 + A_1 + A_2 + \dots + A_n = 1$$
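
To make the normalization concrete, here is a small sketch that sums the abundance of every chain within each weight category. The bead abundances in `SIGMA` are hypothetical values chosen only for illustration (they are not part of the question), and the function name is made up as well. Exact fractions are used so that the total can be compared with 1 without rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical bead abundances (an assumption, not from the post); they sum to 1.
SIGMA = {"w": Fraction(1, 2), "b": Fraction(3, 10), "B": Fraction(1, 5)}
EXTRA_WEIGHT = {"w": 0, "b": 1, "B": 2}

def category_abundances(length, sigma=SIGMA, extra_weight=EXTRA_WEIGHT):
    """Return {extra weight: total abundance of all chains in that category}."""
    totals = {}
    for beads in product(sigma, repeat=length):
        abundance = Fraction(1)
        for bead in beads:
            abundance *= sigma[bead]  # beads are placed independently
        extra = sum(extra_weight[bead] for bead in beads)
        totals[extra] = totals.get(extra, Fraction(0)) + abundance
    return totals

abundances = category_abundances(4)
print(sum(abundances.values()) == 1)
```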

Consider the following scenario:

We choose only a subset of these weight categories, always including the lightest category of chains and all categories up to a category of a certain mass. The heaviest included category of chains is $n$ units heavier than the lightest included category. With $n=2$ in the example above, our subset of sequences would be:

W+0: wwww
W+1: bwww, wbww, wwbw, wwwb
W+2: bbww, bwbw, bwwb, wbbw, wbwb, wwbb, Bwww, wBww, wwBw, wwwB

After we choose a subset, all chains are cut at random positions. The right parts of the chains are discarded, and the left parts are again put into different bags according to their length $M$. These cut chains are referred to as fragments.
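
The select-then-cut procedure can be sketched by brute force for the three-bead example. The sketch rests on an assumption that goes beyond the text: every chain in the selected subset contributes its left part of length $M$ exactly once, so the result is a count of parent-fragment pairs, not a normalized abundance. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

EXTRA_WEIGHT = {"w": 0, "b": 1, "B": 2}

def weight(beads):
    """Extra weight of a chain or fragment relative to an all-white one."""
    return sum(EXTRA_WEIGHT[bead] for bead in beads)

def fragment_pair_counts(N, M, n):
    """Count parent-fragment pairs per fragment weight category.

    Assumption: each chain of length N in categories W+0 .. W+n
    contributes its left part of length M exactly once.
    """
    counts = Counter()
    for beads in product(EXTRA_WEIGHT, repeat=N):
        if weight(beads) <= n:               # keep only the selected categories
            counts[weight(beads[:M])] += 1   # the left part is the fragment
    return dict(counts)

print(fragment_pair_counts(4, 2, 2))
```

For $N=4$, $M=2$, $n=2$ the 15 selected chains split into 6, 6 and 3 pairs for fragment categories +0, +1 and +2.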

My question is: Considering one of these bags with fragments of length $M$ from chains of length $N$, after isolating weight categories $0$ to $n$, what are the abundances of these fragments' weight categories?

I believe that, after getting a lot of help, I have arrived at a formula for the abundances in the case where we only have two types of beads:

Let's say that white beads have a mass of w, and black beads a mass of w+1.

Considering a fragment of length $M$ and one of its categories of $m$ additional weight units compared to the lightest category (in this case $m$ is also the number of black beads in the sequence), a possible parent chain can be formed by adding a sequence of $N-M$ beads to the given fragment with no more than $n-m$ black beads. The number of such parent sequences is: $$\sum_{k = 0}^{n - m} {N - M \choose k}$$
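
The count of parent sequences can be checked against a direct enumeration of the appended tails. This is a minimal sketch with made-up function names; it covers only the two-bead case described above.

```python
from itertools import product
from math import comb

def parent_count_formula(N, M, m, n):
    """Sum of C(N-M, k) for k = 0 .. n-m."""
    return sum(comb(N - M, k) for k in range(n - m + 1))

def parent_count_brute(N, M, m, n):
    """Count tails of N-M beads (w or b) with at most n-m black beads."""
    return sum(
        1 for tail in product("wb", repeat=N - M) if tail.count("b") <= n - m
    )

# Example: N=4, M=2, fragment with m=1 black bead, categories up to n=2.
print(parent_count_formula(4, 2, 1, 2), parent_count_brute(4, 2, 1, 2))
```

In the example, the three admissible tails are ww, wb and bw, so both functions should return 3.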

Furthermore, the number of fragments with $M$ beads and $m$ black beads is ${M \choose m}$. The number of parent-fragment pairs is thus: $${M \choose m}\sum_{k = 0}^{n - m} {N - M \choose k}$$

Each fragment could be said to have a number of donating parent categories; that is, parent chain categories that may produce fragments with $m$ black beads. A fragment with $m$ black beads may of course only have parent chains with at least $m$ black beads. Each one of the possible parent chains of parent category $l$ ($l$ black beads) has an abundance of $$I_l=\sigma_w^{N-l}\,\sigma_b^{l}$$

where $\sigma_w$ and $\sigma_b$ are the abundances of white and black beads respectively. In $\sum_{k = 0}^{n - m} {N - M \choose k}$, the index $k$ runs over the donating parent categories, the first being $m$; thus $m+k$ is the donating parent chain category, which contains ${N - M \choose k}$ different parent chain sequences for a given fragment. The abundance of a fragment category is thus:

$$A = {M \choose m}\sum_{k = 0}^{n - m} {N - M \choose k} {I_{m+k}}$$
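
One way to test the formula numerically is to compare it with a direct sum over parent chains. The sketch below assumes that each chain with at most $n$ black beads contributes its left part of length $M$ exactly once, weighted by $I_l$, and that no normalization is applied. The abundances $\sigma_w = 3/4$ and $\sigma_b = 1/4$ are hypothetical values, and all function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def chain_abundance(l, N, sigma_w, sigma_b):
    """I_l: abundance of one specific chain with l black beads."""
    return sigma_w ** (N - l) * sigma_b ** l

def fragment_abundance_formula(N, M, m, n, sigma_w, sigma_b):
    """A = C(M, m) * sum_{k=0}^{n-m} C(N-M, k) * I_{m+k}."""
    return comb(M, m) * sum(
        comb(N - M, k) * chain_abundance(m + k, N, sigma_w, sigma_b)
        for k in range(n - m + 1)
    )

def fragment_abundance_brute(N, M, m, n, sigma_w, sigma_b):
    """Sum I_l over all chains with <= n black beads whose left part has m."""
    total = Fraction(0)
    for chain in product("wb", repeat=N):
        blacks = chain.count("b")
        if blacks <= n and chain[:M].count("b") == m:
            total += chain_abundance(blacks, N, sigma_w, sigma_b)
    return total

# Hypothetical abundances, chosen only for illustration.
sigma_w, sigma_b = Fraction(3, 4), Fraction(1, 4)
for m in range(3):
    formula = fragment_abundance_formula(4, 2, m, 2, sigma_w, sigma_b)
    brute = fragment_abundance_brute(4, 2, m, 2, sigma_w, sigma_b)
    print(m, formula, brute, formula == brute)
```

Under these assumptions the two computations agree for $N=4$, $M=2$, $n=2$. Note that the values of $A$ over all $m$ add up to the total abundance of the selected subset, not to 1, so a normalization step would still be needed to obtain relative abundances.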


  1. Does this seem right to you?
  2. In this example, we had two types of beads. How would the formula change if we had three types of beads, with masses w, w+1 and w+2?
  3. How would the formula change to answer the general question with an arbitrary number of bead types?

Please don't be afraid to criticize or ask questions.

Thanks!

Max