I'm reading up on Markov chains and came across the following sentence in the Wikipedia article on Markov information source:
> A unifilar Markov source is a Markov source for which the values $f(s_k)$ are distinct whenever each of the states $s_k$ are reachable, in one step, from a common prior state.
Could someone elaborate on this? (The article is rather brief.)
Let's suppose that our Markov chain $M$ is a random walk on a path of length $6$: the states are integers $\{0,1,2,3,4,5,6\}$, and at every step we go from state $i$ to state $i+1$ or $i-1$ with equal probability (except at $0$ and $6$, where we go to $1$ and $5$ respectively with probability $1$).
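For concreteness, this walk is easy to simulate; here is a minimal Python sketch (the names `step` and `walk` are just illustrative):

```python
import random

def step(state):
    """One step of the random walk on the path 0, 1, ..., 6."""
    if state == 0:
        return 1
    if state == 6:
        return 5
    return state + random.choice([-1, 1])

def walk(start, n):
    """Return the states X_0, X_1, ..., X_n of the walk."""
    states = [start]
    for _ in range(n):
        states.append(step(states[-1]))
    return states
```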
Here are two possible Markov information sources we can define on this Markov chain (our alphabet will be the English alphabet):

- $(M,f)$, where $f$ assigns the states $0,1,2,3,4,5,6$ the letters $\textsf{G},\textsf{E},\textsf{O},\textsf{R},\textsf{G},\textsf{I},\textsf{A}$ (spelling $\textsf{GEORGIA}$);
- $(M,g)$, where $g$ assigns the states $0,1,2,3,4,5,6$ the letters $\textsf{I},\textsf{N},\textsf{D},\textsf{I},\textsf{A},\textsf{N},\textsf{A}$ (spelling $\textsf{INDIANA}$).
So if our random walk starts at state $0$ and happens to go $0, 1, 2, 3, 4, 5, 4, 3, 2, 3, 4, 3, \dots$ then the Markov information source $(M,f)$ will produce a sequence beginning $\textsf{GEORGIGRORGR}\dots$ and the Markov information source $(M,g)$ will produce a sequence beginning $\textsf{INDIANAIDIAI}\dots$.
In this example, $(M,f)$ is unifilar: the neighbors of each state are assigned distinct letters. (It's okay that $f(0) = f(4) = \textsf{G}$, because there's no state that can transition to both $0$ and $4$.) However, $(M,g)$ is not unifilar: from state $5$ we can go to states $4$ and $6$, but $g(4) = g(6) = \textsf{A}$.
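The unifilar condition is mechanical enough to check by machine. A Python sketch (the names `neighbors` and `is_unifilar` are mine; the letter assignments are read off the example outputs above):

```python
# One-step successors of each state on the path 0..6.
neighbors = {0: [1], 6: [5], **{i: [i - 1, i + 1] for i in range(1, 6)}}

f = dict(enumerate("GEORGIA"))  # letters assigned by f to states 0..6
g = dict(enumerate("INDIANA"))  # letters assigned by g to states 0..6

def is_unifilar(labels, neighbors):
    """Unifilar iff from every state, the reachable successor
    states all carry distinct letters."""
    return all(
        len({labels[t] for t in succ}) == len(succ)
        for succ in neighbors.values()
    )
```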
Roughly speaking, here is why this matters. With a unifilar information source such as $(M,f)$, if I tell you the starting state $X_0$ and then the sequence of letters produced up to some step $n$, you will be able to uniquely determine the intermediate states $X_1, \dots, X_{n-1}$ that the Markov chain visited and, most importantly, the state $X_n$ that the Markov chain ends up at. You can do this because from each state, the different transitions you can take produce different letters as output. As a result, it's very easy to compute the distribution of the next letter in the sequence.
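The decoding argument can be written out directly: from the current state, the observed letter picks out a unique successor. A sketch under the same setup (names are illustrative):

```python
neighbors = {0: [1], 6: [5], **{i: [i - 1, i + 1] for i in range(1, 6)}}
f = dict(enumerate("GEORGIA"))  # letters read off the example above

def decode(neighbors, labels, x0, letters):
    """Recover X_1, ..., X_n from the start state and the letters
    they emitted, assuming the source is unifilar."""
    states = [x0]
    for letter in letters:
        # Unifilarity guarantees exactly one successor emits this letter.
        (nxt,) = [t for t in neighbors[states[-1]] if labels[t] == letter]
        states.append(nxt)
    return states

decode(neighbors, f, 0, "EORGIGRORGR")
# -> [0, 1, 2, 3, 4, 5, 4, 3, 2, 3, 4, 3]
```

Starting at $0$ and feeding in everything after the initial $\textsf{G}$ of $\textsf{GEORGIGRORGR}$ recovers exactly the walk from the example.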
If the Markov information source is not unifilar, then knowing the starting state and the letters produced does not tell you which states the Markov chain visited. For example, with $(M,g)$, the sequences of states $0,1,2,3,4,5,6$ and $0,1,2,3,4,5,4$ produce the same sequence of letters $\textsf{INDIANA}$. We can still compute a distribution for the next letter, but it requires averaging: the final state is equally likely to be $4$ or $6$, so the next letter is $\textsf{I}$ with probability $\frac12 \cdot \frac12 + \frac12 \cdot 0 = \frac14$ and $\textsf{N}$ with probability $\frac12 \cdot \frac12 + \frac12 \cdot 1 = \frac34$.
In general, this averaging can get arbitrarily complicated. Maybe at some point we're equally likely to be in states $s_1$ and $s_2$; from $s_1$, the next states produce letters $\textsf{X}$ and $\textsf{Y}$ with equal probability, and from $s_2$, the next states are twice as likely to produce $\textsf{X}$ as $\textsf{Y}$. Then the correct thing to do upon seeing an $\textsf{X}$ as the next letter is to do a Bayesian update and conclude that with probability $\frac23$, we were actually at $s_2$ and not $s_1$. This, of course, affects the distribution of the next letter...
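This averaging, including the Bayesian update, is exactly forward filtering for a hidden Markov model: keep a posterior over the current state, renormalize after each observed letter, then mix the one-step letter distributions. A sketch with exact arithmetic (the function name is mine):

```python
from fractions import Fraction

neighbors = {0: [1], 6: [5], **{i: [i - 1, i + 1] for i in range(1, 6)}}
g = dict(enumerate("INDIANA"))  # letters read off the example above

def next_letter_dist(neighbors, labels, x0, letters):
    """Posterior over the current state after seeing `letters` from
    start state x0, then the induced distribution of the next letter."""
    belief = {x0: Fraction(1)}
    for letter in letters:
        new = {}
        for s, p in belief.items():
            for t in neighbors[s]:
                if labels[t] == letter:
                    new[t] = new.get(t, 0) + p / len(neighbors[s])
        total = sum(new.values())  # Bayesian renormalization
        belief = {s: p / total for s, p in new.items()}
    dist = {}
    for s, p in belief.items():
        for t in neighbors[s]:
            dist[labels[t]] = dist.get(labels[t], 0) + p / len(neighbors[s])
    return dist
```

Feeding in the letters $\textsf{NDIANA}$ (everything after the initial $\textsf{I}$) from start state $0$ gives $\textsf{I}$ with probability $\frac14$ and $\textsf{N}$ with probability $\frac34$, matching the computation above.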