I have a few questions regarding measure-preserving dynamical systems $(X,\mathcal{A},\mu,T)$.
1) The definition of measure preserving is always stated as $$\mu(T^{-1}B)=\mu(B),$$ for all $B \in \mathcal{A}$.
I believe this neither implies nor is implied by the similar statement $$\mu(TB)=\mu(B).$$
Is this correct? The problem lies in the fact that $TT^{-1}B \subseteq B \subseteq T^{-1}TB$ may be strict inclusions, so the following implications may not hold: $$\mu(T^{-1} TB) = \mu(TB) \implies \mu(B)=\mu(TB)$$ and $$\mu(TT^{-1}B)= \mu(T^{-1}B) \implies \mu(B)=\mu(T^{-1}B).$$
2) Is there some "reason" why the above definition involves $T^{-1}$ rather than $T$?
3) The definition of strong mixing is $$\lim_{n \to \infty} \mu(A \cap T^{-n}B) = \mu(A) \mu(B)$$ for all $A$ and $B$. Again, the exponent of $T$ is negative. However, in the wine-water example below the definition, they have $T^n$. Is this an issue?
Thank you in advance
$1$) This is correct: there are measure-preserving transformations whose forward images change the measure.
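To make this concrete, here is a sketch using the doubling map, a standard example that is not taken from the question. Let $X=[0,1)$ with Lebesgue measure $\mu$ and $T(x)=2x \bmod 1$. For an interval $[a,b)\subseteq[0,1)$,
$$T^{-1}[a,b)=\left[\tfrac{a}{2},\tfrac{b}{2}\right)\cup\left[\tfrac{a+1}{2},\tfrac{b+1}{2}\right),\qquad \mu\left(T^{-1}[a,b)\right)=\tfrac{b-a}{2}+\tfrac{b-a}{2}=b-a,$$
so $T$ preserves the measure of intervals (and hence of all Borel sets) under preimages, whereas
$$T\left[0,\tfrac12\right)=[0,1),\qquad \mu\left(T\left[0,\tfrac12\right)\right)=1\neq\tfrac12=\mu\left(\left[0,\tfrac12\right)\right).$$
In the other direction, assuming one allows infinite measures, the shift $S(x)=x+1$ on $[0,\infty)$ with Lebesgue measure satisfies $\mu(SB)=\mu(B)$ for every Borel set $B$, yet $S^{-1}[0,1)=\emptyset$, so $\mu\left(S^{-1}[0,1)\right)=0\neq 1=\mu([0,1))$.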
$2$) Preimages commute with the basic set operations (complement, union, and intersection), whereas images in general do not, and being able to use this is the foremost reason for stating the definition with $T^{-1}$. For bijections, there is no need to distinguish, of course, and this is the most common case, but the theorem on continuous homomorphisms from the example above is another great example of why we choose preimages.
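Spelled out, the identities in question hold for an arbitrary map $T$ and arbitrary sets $A, A_i$:
$$T^{-1}(A^c)=(T^{-1}A)^c,\qquad T^{-1}\Big(\bigcup_i A_i\Big)=\bigcup_i T^{-1}A_i,\qquad T^{-1}\Big(\bigcap_i A_i\Big)=\bigcap_i T^{-1}A_i.$$
For images only the union identity survives in general; for instance
$$T(A\cap B)\subseteq TA\cap TB,$$
and the inclusion can be strict when $T$ is not injective. One consequence of the preimage identities is that $B\mapsto\mu(T^{-1}B)$ is again a measure on $\mathcal{A}$, so "measure preserving" simply says that this measure equals $\mu$.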
$3$) Because $T$ is measure preserving and bijective (you're not losing any molecules in this particular system), $\mu(T^nA\cap B)=\mu(A\cap T^{-n}B)$, so the two formulations are equivalent, since
$$T^{-n}(T^nA\cap B)=A\cap T^{-n}B.$$
More generally this is not so. Indeed, recall that a measurable function is defined by the requirement that preimages of measurable sets are measurable, so the image of a measurable set may not even be in the relevant $\sigma$-algebra. (h/t m.g. for making the implicit explicit)
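For completeness, the identity used in $3$) can be checked step by step. This is a sketch under the assumptions that $T$ is bijective and that $T^nA$ is measurable:
$$T^{-n}(T^nA\cap B)=T^{-n}T^nA\cap T^{-n}B=A\cap T^{-n}B,$$
where the first equality uses that preimages commute with intersections and the second uses injectivity, which gives $T^{-n}T^nA=A$. Applying measure preservation to the set $T^nA\cap B$ then yields
$$\mu(T^nA\cap B)=\mu\left(T^{-n}(T^nA\cap B)\right)=\mu(A\cap T^{-n}B).$$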