Suppose the following procedure:
- We assemble an infinite number of strings of arbitrary length using the alphabet [a,b,c,d,i,j,x,y,z]. Characters are chosen at random, repetition is allowed.
- We divide the alphabet into 3 sub-alphabets a1=[a,b,c,d], a2=[i,j] and a3=[x,y,z]. We remove all generated strings that do not contain at least one character from each alphabet - "jaz" is valid, "bazz" is not.
- We traverse an arbitrary generated string from left to right, knowing its length in advance. We wish to calculate the probability of which alphabet the next character in the sequence will belong to. For instance, we have traversed the first 4 characters of a length 5 string. The first 4 characters are "ciia". Only alphabet 3 has yet to be used and there is only character remaining, hence the probability of the next character belonging to a3 is 100%.
What is the general formula for calculating the probability of each alphabet being used at each step?
Let $W_n$ be the set of words of length $n$ that use at least one letter from each of the three sub-alphabets. I assume that, once you are told the length of the word is $n$, then any of the words in $W_n$ are equally likely.
Assume that you have currently observed some prefix $p$ of the word, where the length of $p$ is at most $n-1$, and you want to determine the probability distribution of the next letter. Let $W_n(p)$ denote the set of words in $W_n$ which have $p$ has a prefix. For each letter $\ell$ in the alphabet, we can determine the probability of seeing $\ell$ next as follows: $$ P(\text{next letter is $\ell$}\mid \text{seen prefix of $p$})=\frac{|W_n(p+\ell)|}{|W_n(p)|} $$ Here, $p+\ell$ is the result of appending $\ell$ to $p$. Therefore, in order to compute this expression, we just need a way to compute $|W_n(p)|$ for arbitrary prefixes, $p$. This is just a basic application of the conditional probability formula, $P(A\mid B)=P(A\cap B)/P(B)$. To compute the probabilities in the numerator and denominator, use the formula probability = (# favorable cases)/(# total cases), which is valid because I assumed at the beginning that all words in $W_n$ are equally likely.
We can compute $|W_n(p)|$ using the principle of inclusion-exclusion, as follows:
If $p$ contains one letter from each alphabet, then $|W_n(p)|=9^{n-|p|}$, where $|p|$ denotes the length of $p$. This is because there are $n-|p|$ letters remaining, and $9$ choices for each.
If $p$ is missing letters from the first alphabet, $A_1$, but $p$ contains letters from $A_2$ and $A_3$, then $|W_n(p)|=9^{n-|p|}-5^{n-|p|}$. There are $9^{n-|p|}$ to fill out the rest of the word with letters from the three alphabets, but we need to subtract the $5^{9-|p|}$ ways which only use letters from $A_2$ and $A_3$.
Similarly, if $p$ contains letter from $A_1$ and $A_3$, but not $A_2$, then $|W_n(p)|=9^{n-|p|}-7^{n-|p|}$. If $p$ contains letters from $A_1$ and $A_2$, but not from $A_3$, then $|W_n(p)|=9^{n-|p|}-6^{n-|p|}$.
If $p$ contains only letters from $A_1$, then $|W_n(p)|=9^{n-|p|}-7^{n-|p|}-6^{n-|p|}+4^{n-|p|}$.
If $p$ contains only letters from $A_2$, then $|W_n(p)|=9^{n-|p|}-5^{n-|p|}-6^{n-|p|}+2^{n-|p|}$.
If $p$ contains only letters from $A_3$, then $|W_n(p)|=9^{n-|p|}-7^{n-|p|}-5^{n-|p|}+3^{n-|p|}$.
Finally, if $p$ is the empty prefix (which occurs when you are predicting the first letter of the word), so $|p|=0$, then $$ |W_n(p)|=|W_n|=9^n-7^n-6^n-5^n+4^n+3^n+2^n-0^n $$