I am interested in the collision entropy rate of a hidden Markov chain, and I wonder if my way of calculating it is correct and if it has been described before.
Definition
Consider a Markov chain with state space $S$ and transition probabilities $p \colon S \times S \to \mathbb R$, where the state is not directly observable, but only through a (deterministic) function $\rho \colon S \to \Gamma$ for some output alphabet $\Gamma$. Assume the Markov chain is aperiodic and irreducible, so that it has a unique stationary distribution, and assume it starts in that distribution.
Let $X_n$ be the state of the chain at step $n$. Then the collision entropy rate is $$ H = \lim_{n\to\infty} \frac 1n H_2(\rho(X_1),\ldots,\rho(X_n)) $$ where $H_2$ is the collision entropy, i.e. the Rényi entropy of order 2. Recall that $H_2(Y) = -\log \sum_y \Pr[Y=y]^2 = -\log \Pr[Y = Y']$, where $Y'$ is an independent copy of $Y$.
Calculation
I use the following procedure in order to calculate $H$:
I find the stationary distribution of "still-colliding" state pairs of two independent copies of the chain, and then calculate the probability that the two copies produce the same output in the next step (i.e. remain colliding).
More precisely, the product Markov chain has state space $S \times S$ and transition probability $$ p((s_1,s_2),\, (s_1', s_2')) = p(s_1,s_1') \cdot p(s_2,s_2'). $$ I find a probability distribution $P \colon C \to \mathbb R$ on the colliding states $C := \{(s_1,s_2) \in S\times S \mid \rho(s_1) = \rho(s_2)\}$ that is stationary in the following sense: if the product Markov chain starts in this distribution, takes a step, and ends up in a state in $C$ again, then, conditioned on that, its state is again distributed according to $P$. In other words, $P$ satisfies the equation $$ P(s) = \frac{\displaystyle\sum_{s' \in C}P(s')\cdot p(s',s)}{\displaystyle\sum_{s',s\in C} P(s')\cdot p(s',s)} $$ for all $s \in C$; such a $P$ is known as a quasi-stationary distribution. The denominator re-normalizes the distribution, since the next state may lie outside $C$. I find this distribution as the eigenvector with the largest eigenvalue of the transition matrix of the product Markov chain with all transitions out of $C$ set to zero. (In contrast to the usual stationary distribution of a Markov chain, this eigenvalue is smaller than 1.)
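As a concrete illustration, the eigenvector computation might look as follows in Python. This is only a sketch: the 3-state chain `p` and output function `rho` are made-up example data, not taken from the question.

```python
import numpy as np

# Hypothetical 3-state chain (example data, not from the question):
# all entries positive, so the chain is irreducible and aperiodic.
p = np.array([[0.1, 0.6,  0.3 ],
              [0.4, 0.2,  0.4 ],
              [0.5, 0.25, 0.25]])
rho = [0, 0, 1]  # deterministic output function: states 0 and 1 look alike

n = len(p)
# Colliding state pairs C = {(s1, s2) : rho(s1) == rho(s2)}
C = [(a, b) for a in range(n) for b in range(n) if rho[a] == rho[b]]

# Product-chain transition matrix with transitions leaving C set to zero,
# i.e. the submatrix of the product chain indexed by C.
M = np.array([[p[a, c] * p[b, d] for (c, d) in C] for (a, b) in C])

# P is the left eigenvector of M for the largest eigenvalue
# (Perron-Frobenius): it satisfies P @ M == lam * P, which is the
# stationarity equation up to normalization.
eigvals, eigvecs = np.linalg.eig(M.T)
i = np.argmax(eigvals.real)
lam = eigvals[i].real           # strictly between 0 and 1 here
P = np.abs(eigvecs[:, i].real)
P /= P.sum()                    # normalize to a probability distribution on C
```

Since `M` is strictly positive in this toy example, Perron–Frobenius guarantees a real leading eigenvalue with a strictly positive eigenvector, so taking absolute values and normalizing is safe.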
The collision probability rate is now simply the probability of this product Markov chain remaining in $C$, and hence the collision entropy rate is: $$ H = -\log \sum_{s',s\in C} P(s')\cdot p(s',s). $$ Equivalently, the sum inside the logarithm is the largest eigenvalue $\lambda$ from above, so $H = -\log \lambda$.
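One way to sanity-check this numerically is to compare $-\log \lambda$ against the finite-$n$ quantity $\frac 1n H_2(\rho(X_1),\ldots,\rho(X_n))$, which can be computed exactly through the product chain: the collision probability of two independent stationary copies after $n$ steps is the mass that starts in $C$ (under the product of the stationary distribution with itself) and stays in $C$ at every step. A self-contained Python sketch, again with a made-up 3-state example chain:

```python
import numpy as np

# Hypothetical 3-state example chain (made-up data, not from the question).
p = np.array([[0.1, 0.6,  0.3 ],
              [0.4, 0.2,  0.4 ],
              [0.5, 0.25, 0.25]])
rho = [0, 0, 1]

# Stationary distribution pi of the single chain (left Perron eigenvector).
evals, evecs = np.linalg.eig(p.T)
pi = np.abs(evecs[:, np.argmax(evals.real)].real)
pi /= pi.sum()

# Restricted product chain and the proposed entropy rate H = -log(lambda).
C = [(a, b) for a in range(len(p)) for b in range(len(p)) if rho[a] == rho[b]]
M = np.array([[p[a, c] * p[b, d] for (c, d) in C] for (a, b) in C])
lam = max(np.linalg.eigvals(M).real)
H = -np.log(lam)

# Exact collision probability Pr[rho(X_1..n) = rho(X'_1..n)] for two
# independent stationary copies: start in pi x pi restricted to C and
# apply M repeatedly; renormalize each step to avoid underflow.
v = np.array([pi[a] * pi[b] for (a, b) in C])
logprob = np.log(v.sum())   # n = 1: both copies emit the same first symbol
v /= v.sum()
N = 500
for _ in range(N - 1):
    v = v @ M
    s = v.sum()             # per-step survival probability; tends to lambda
    logprob += np.log(s)
    v /= s

rate = -logprob / N         # (1/N) * H_2 of the first N output symbols
# rate should approach H, up to an O(1/N) correction
```

The renormalized iteration is just power iteration on `M`, so the per-step survival probability `s` converges to $\lambda$, while `rate` converges to $H$ only at rate $O(1/N)$.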
Question
Intuitively, this seems to be correct, but is it? Also, I would have expected to find this (or some other) way of calculating $H$ in the literature, but have failed so far. Did I miss anything? If not: Is this just obvious, or indeed an interesting result?
Yes, it turns out that this approach is correct. A more elaborate treatment can be found in this preprint, which uses slightly different language.