Distribution of lengths of "chains" of repeated outcomes in a binomial process

25 Views Asked by At

Setup:

Suppose I am interested in a simple binomial random process with two possible outcomes: $A$ with probability $p$, and $B$ with probability $1-p$

Each event of the random process is independent.

Problem:

For a given number of trials $N$, define the set of lengths $l$ as the lengths of repeated segments of $A$ outcomes. From this, I can define a distribution $P(l)$ that a given length takes the value $l$.

An example with 13 events:

$AAABABAABBAAA$ has four lengths $l$: 3, 1, 2, 3.

Hence $P(l=3) = 1/2$, $P(l=1)=1/4$ and $P(l=2) = 1/4$.

The question is simple: what is this (normalised) distribution as $N\to\infty$?

Why I'm stuck:

Though the probability of having $l$ events in a row all be $A$ is trivial to compute ($p^l$), computing the probability that a given `chain' of $A$ events is of length $l$ is confusing me! Naively I would say that a chain of length $l$ has weight $p^l$, and hence $P(l) = p^l/\sum_lp^l$ - but I am worried I might be thinking too simplistically...