I have a directed graph with out-degree 2, such as the following:
Every nodes is colored in one of two colors. A random walk (taking any one of the out-edges with the same probability) on this graph thus yields a stochastic process $(X_i)$ with values in $\{\text{blue},\text{red}\}$.
What is the entropy rate $$ H(X) := \lim_{n\to\infty} \frac{H(X_1,\ldots,X_n)}{n}, $$ of this process?
(Just for completeness, $$ H(X_1,\ldots,X_n) = - \sum_{x_1,\ldots,x_n} P(X_1 = x_1,\dots,X_n = x_n) \cdot \log P(X_1 = x_1,\dots,X_n = x_n) $$ is the the usual entropy of a random variable.)

Let me change a bit the notation. Let $X_i$ be the label of the node (or state) visited at time $i$, and $Y_i=g(X_i)$ its colour. Let $X_{[n]}=(X_1, X_2 \cdots, X_n)$
Then we have
$$ H(Y_{[n]})=H(X_{[n]})- H(X_{[n]}\mid Y_{[n]}) \tag{1}$$
Also, because $X_i$ is a Markov chain, and assuming we start from a random node $$H(X_{[n]}) = \log_2G + (n-1)\log_2 \alpha$$
where $G$ is the size of the graph and $\alpha$ is the (assumed constant) outdegree . (In our case $\alpha=2$)
$H(X_{[n]}\mid Y_{[n]})$, relates to recovering the state path given the colour sequence. Dividing $(1)$ by $n$ we define the conditinal entropy rate or average information loss:
$$H_r(X \mid Y) = \frac{H(X_{[n]}\mid Y_{[n]})}{n} = H_r(X)-H_r(Y)$$
If given the colours sequence we can recover the state sequence (preimage), then $H_r(X \mid Y)=0$. This also holds even if there are several preimages but the amount is $o(n)$.
If that holds ("entropy rate preserved"), then, in our case $H_r(Y)=H_r(X)=1$ bit.
In this paper this problem is studied. The key structural property of the graph is the split-merge index $K$ defined as the mininum $n$ such that there exists two different state sequences (with non-zero probability), starting and ending at the same states, of length $n$ (not counting the start and ending nodes), with the same "colour" ($Y_{[n]}$) sequence.
One result is that $H_r(X \mid Y)=0 \iff K = \infty$
This is clearly not the case in the despicted graph ($K=1$, see the node at the bottom).
This is just a starting point. We are left with
How to compute $K$ for a general graph: this is discussed in the paper. It is algoritmically doable because of the property $K < \infty \implies K< G^2$
How to relate $K$ with $H_r(X \mid Y)$. This is in general difficult, one at most can expect some bounds. See sec 3.4 of the paper.
Does the particular conditions here ($|{\mathcal Y|=2}$ and $\alpha=2$) help to simplify the calculation? I have no idea.