Expectation of characters in a string


What is the expected distance between two "e"s in a random character stream where "e"s occur 11% of the time?


There are 2 best solutions below

1
On

The probability that a letter is an "e" is $11/100$.

Assume independence, a totally unreasonable assumption for a piece of real text, but a reasonable interpretation of the phrase "random character stream."

Given that the current letter is an "e", or even without that information, the waiting time until the next "e" has a geometric distribution with parameter $p=11/100$. The mean waiting time is $1/p$, so it is $100/11 \approx 9.09$.
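To see where the mean $1/p$ comes from, here is a short sketch of the standard derivation. The symbols $X$ (the waiting time) and $q = 1-p$ are introduced here for convenience, and independence is assumed as above:

$$E[X] = \sum_{k=1}^{\infty} k\,q^{k-1}p = p\,\frac{d}{dq}\sum_{k=0}^{\infty} q^{k} = p\,\frac{d}{dq}\left(\frac{1}{1-q}\right) = \frac{p}{(1-q)^{2}} = \frac{1}{p}.$$

With $p = 11/100$ this gives $E[X] = 100/11 \approx 9.09$.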

For a geometrically distributed random variable $X$, the value of $X$ is the total number of trials up to and including the first success. The length of the gap between the two "e"s (the number of characters strictly between them) is $1$ less than the waiting time, so it has mean $100/11 - 1 = 89/11 \approx 8.09$.
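As a rough sanity check, the two quantities can be estimated by simulation. This is only a minimal sketch under the independence assumption: the function name `mean_distance_between_es`, the stream length, and the seed are all illustrative choices, not part of the original answer.

```python
import random

def mean_distance_between_es(p=0.11, n_chars=1_000_000, seed=0):
    """Estimate the mean difference in position between consecutive 'e's.

    Each character is treated as an independent trial that is an 'e'
    with probability p (the independence assumption from the answer).
    """
    rng = random.Random(seed)
    last = None   # position of the most recent 'e', if any
    total = 0     # sum of position differences between consecutive 'e's
    count = 0     # number of consecutive pairs seen
    for i in range(n_chars):
        if rng.random() < p:
            if last is not None:
                total += i - last
                count += 1
            last = i
    return total / count

mean_wait = mean_distance_between_es()  # compare with 100/11
mean_gap = mean_wait - 1                # compare with 89/11
print(mean_wait, mean_gap)
```

The first estimate should land near $100/11$ and the second near $89/11$; which of the two counts as the "distance" depends on whether the second "e" itself is counted.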

1
On

In a sequence of independent trials in which $A$ occurs on each trial with probability $p$, the expected distance between two consecutive occurrences of $A$ is $\frac{1}{p}$. So the mean distance between two "e"s is $100/11 \approx 9.09$.