Let $T\colon X\to X$ be a measure-preserving transformation on a probability space $(X,\mu)$. We say that $T$ is ergodic if every measurable set $A$ with $T^{-1}(A)=A$ satisfies $\mu(A)\in\{0,1\}$. I have seen that the following are equivalent:
1. $T$ is ergodic,
2. If $B$ is measurable and $\mu(T^{-1}(B)\Delta B)=0$, then $\mu(B)\in\{0,1\}$. (Here $\Delta$ stands for the symmetric difference.)

However, I was reasoning as follows. If $C$ is measurable and $C\subset T^{-1}(C)$, then $$\begin{aligned}\mu(T^{-1}(C)\Delta C)&=\mu(T^{-1}(C)\setminus C)+\mu(C\setminus T^{-1}(C))=\mu(T^{-1}(C)\setminus C)\\ &=\mu(T^{-1}(C))-\mu(C)=\mu(C)-\mu(C)=0.\end{aligned}$$ So I concluded that there was another characterization of ergodicity, namely:

3. If $C$ is measurable and $C\subset T^{-1}(C)$, then $\mu(C)\in\{0,1\}$.
Note that 3 clearly implies 1 as well. But I find this characterization really counterintuitive, especially since I have not encountered it in the literature.
I think it is a nice characterization for the following reason: Suppose $T$ is ergodic and we want to show that a measurable set $C$ has either measure $0$ or $1$. Then, by 3, it suffices to prove that $C\subset T^{-1}(C)$. The reverse inclusion does not even have to hold!
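As a finite sanity check (a minimal sketch, not a proof), one can take $X=\{0,\dots,n-1\}$ with the uniform measure and $T$ a permutation, which is automatically measure-preserving. The helper names `preimage`, `subsets`, and `forward_invariant_measures`, as well as the two example permutations, are my own illustrative choices. The code enumerates every $C$ with $C\subset T^{-1}(C)$, confirms that the symmetric difference is null, and collects the measures that occur.

```python
from itertools import combinations
from fractions import Fraction

def preimage(T, C):
    """Return T^{-1}(C) = {x : T[x] in C} for a map T given as a list."""
    return frozenset(x for x in range(len(T)) if T[x] in C)

def subsets(n):
    """Yield every subset of {0, ..., n-1} as a frozenset."""
    for k in range(n + 1):
        for c in combinations(range(n), k):
            yield frozenset(c)

def measure(C, n):
    """Uniform probability measure on {0, ..., n-1}."""
    return Fraction(len(C), n)

def forward_invariant_measures(T):
    """Measures of all sets C with C contained in T^{-1}(C)."""
    n = len(T)
    found = set()
    for C in subsets(n):
        pre = preimage(T, C)
        if C <= pre:
            # The computation in the question says this difference is null.
            assert measure(pre ^ C, n) == 0
            found.add(measure(C, n))
    return found

# Hypothetical examples: a single 5-cycle, and a permutation with two cycles.
shift = [(x + 1) % 5 for x in range(5)]
two_cycles = [1, 0, 3, 4, 2]  # cycles {0, 1} and {2, 3, 4}

print(forward_invariant_measures(shift))
print(forward_invariant_measures(two_cycles))
```

For the single cycle only the measures $0$ and $1$ occur, while the two-cycle permutation also produces the measures $2/5$ and $3/5$, consistent with 3 holding exactly when the map is ergodic in these two examples.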
So my questions are: Is my reasoning correct, that is, is 3 indeed equivalent to ergodicity? Why does most of the literature (at least what I have found) not mention 3 at all?
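The notion (3) of ergodicity in the OP can be seen in terms of Markov chain terminology (that of absorbent sets or states). Suppose $(X,\mathscr{F})$ is a measurable space. Recall that a Markov transition probability function $P$ is a function $P:X\times \mathscr{F}\rightarrow[0,1]$ such that:
- for every $x\in X$, the map $A\mapsto P(x,A)$ is a probability measure on $(X,\mathscr{F})$;
- for every $A\in\mathscr{F}$, the map $x\mapsto P(x,A)$ is $\mathscr{F}$-measurable.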
Notation:
Definitions
Comments:
The case of the OP can be identified with the Markov transition function $P$ defined by $P\mathbb{1}_A=\mathbb{1}_A\circ T$, that is, $$P(x,A)=\mathbb{1}_A(T(x))=\mathbb{1}_{T^{-1}(A)}(x).$$
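On a finite state space this identification can be checked directly. The following is only an illustrative sketch: the names `kernel` and `apply_kernel`, the map `T`, and the set `A` are assumptions made for the example. It builds the deterministic kernel $P(x,A)=\mathbb{1}_A(T(x))$ and compares $P\mathbb{1}_A$ with $\mathbb{1}_{T^{-1}(A)}$.

```python
def kernel(T):
    """Deterministic kernel P(x, A) = 1_A(T(x)) for a map T given as a list."""
    def P(x, A):
        return 1 if T[x] in A else 0
    return P

def apply_kernel(P, f, states):
    """(Pf)(x) = sum over y of f(y) * P(x, {y}) on a finite state space."""
    return [sum(f[y] * P(x, {y}) for y in states) for x in states]

# Hypothetical example map on {0, ..., 4} and a test set A.
T = [1, 0, 3, 4, 2]
states = range(len(T))
P = kernel(T)
A = {0, 3}

indicator_A = [1 if y in A else 0 for y in states]
preimage_A = {x for x in states if T[x] in A}

# P 1_A should coincide with the indicator of T^{-1}(A).
P_indicator_A = apply_kernel(P, indicator_A, states)
indicator_preimage = [1 if x in preimage_A else 0 for x in states]
print(P_indicator_A, indicator_preimage)
```

Each row $P(x,\cdot)$ here is the Dirac mass at $T(x)$, so it is a probability measure, and the two printed lists agree.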
In this slightly more general setting,
It is not difficult to show that the standard definition of ergodicity for a $\mu$-invariant transformation $T$ is equivalent to (iii) (regardless of whether $\mu$ is finite or not).
The notion of ergodicity that the OP discovered is equivalent to stating that every $T$-absorbent set has either measure $0$ or full measure.
In the setting of Markov chains described above, it is not difficult to show that
Remark: The equivalence (c) in the slightly more general setting of Markov chains is what the OP discovered in the case of invariant maps.
There are several treatments of the ergodic theory of Markov chains: