Let $(X,\mathscr{A},\mu)$ be a probability space and $T\colon X\to X$ a measure-preserving map. Define:
$T$ is mixing iff $\lim_{n\to\infty}\mu(A\cap T^{-n}B)=\mu(A)\mu(B)$ for all $A,B\in\mathscr{A}$.
How can one prove that the mixing property is equivalent to:
$\lim_{n\to\infty}\langle U_T^n f,g\rangle=\langle f,1\rangle\langle 1,g\rangle$ for all $f,g$ in a dense subset of $L^2(\mu)$?
Here, $(U_Tf)(x)=f(Tx)$.
One implication is direct (take characteristic functions for $f$ and $g$), but I'm having trouble proving the other one.
Any help or hint is welcome. Thanks in advance.
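We denote the two statements we are interested in as follows:
(Mixing): $\lim_{n\to\infty}\mu(A\cap T^{-n}B)=\mu(A)\mu(B)$ for all $A,B\in\mathscr{A}$.
(Alternate Mixing): $\lim_{n\to\infty}\langle U_T^n f,g\rangle=\langle f,1\rangle\langle 1,g\rangle$ for all $f,g$ in a dense subset of $L^2(\mu)$.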
As stated by the OP, the implication (Alternate Mixing)$\implies$(Mixing) follows easily by considering indicator functions.
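For completeness, the indicator computation behind this implication can be written out as follows. This is a sketch that assumes the dense subset contains the indicator functions $\mathbf{1}_A$, $A\in\mathscr{A}$ (for instance, when it is all of $L^2(\mu)$). For $A,B\in\mathscr{A}$, $$\langle U_T^n\mathbf{1}_B,\mathbf{1}_A\rangle=\int_X\mathbf{1}_B(T^nx)\,\mathbf{1}_A(x)\,d\mu(x)=\int_X\mathbf{1}_{A\cap T^{-n}B}\,d\mu=\mu(A\cap T^{-n}B),$$ $$\langle\mathbf{1}_B,1\rangle\langle 1,\mathbf{1}_A\rangle=\mu(B)\mu(A).$$ Taking $f=\mathbf{1}_B$ and $g=\mathbf{1}_A$ in the alternate statement therefore gives exactly the mixing definition.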
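For the other implication, we will need the following lemma that follows from Lemma $3.13$ of Rudin's Real and Complex Analysis, $3$rd Ed, p. $69$.
Lemma. Let $S$ denote the set of simple measurable functions on $X$. Then $S$ is dense in $L^2(\mu)$.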
(Mixing)$\implies$(Alternate Mixing)
Assume that the statement of the mixing definition holds true. Take any $r,t\in S$ and write $$ r(x)=\sum^k_{i=1}a_i\mathbf{1}_{A_i}(x),\qquad t(x)=\sum^m_{j=1}b_j\mathbf{1}_{B_j}(x),$$ where $A_1,\dots,A_k,B_1,\dots,B_m\in\mathscr{A}$ and $a_1,\dots,a_k,b_1,\dots,b_m\in\mathbb{R}$.
Since $U_T^n r=r\circ T^n$ and $\mathbf{1}_{A_i}\circ T^n=\mathbf{1}_{T^{-n}A_i}$, we have $$\langle U_T^n r,t\rangle=\int_X \left(\sum_{i}a_i\mathbf{1}_{T^{-n}A_i}\right) \left(\sum_{j}b_j\mathbf{1}_{B_j}\right)d\mu= \int_X \sum_{i,j}a_ib_j\ \mathbf{1}_{T^{-n}A_i\cap B_j}\ d\mu.$$
By the linearity of the integral, $$\langle U_T^n r,t\rangle=\sum_{i,j}a_ib_j \int_X\mathbf{1}_{T^{-n}A_i\cap B_j}\ d\mu=\sum_{i,j}a_ib_j\ \mu(T^{-n}A_i\cap B_j).$$
The sums are finite, so the limit can be taken term by term: $$\lim_{n\rightarrow\infty}\langle U_T^n r,t\rangle=\lim_{n\rightarrow\infty}\sum_{i,j}a_ib_j\ \mu(T^{-n}A_i\cap B_j)=\sum_{i,j}a_ib_j\ \lim_{n\rightarrow\infty}\mu(T^{-n}A_i\cap B_j).$$
By our assumption, applied with $A=B_j$ and $B=A_i$, we have that $$\lim_{n\to\infty}\langle U_T^n r,t\rangle=\sum_{i,j}a_ib_j\mu(A_i)\mu(B_j)=\left(\sum_{i}a_i\mu(A_i)\right)\left(\sum_{j}b_j\mu(B_j)\right)=\mathbb{E}(r)\mathbb{E}(t)=\langle r,1\rangle\langle 1,t\rangle.$$
Hence the alternate mixing statement holds for all $r,t\in S$, and the required result follows.
We can use the result we have just proven to prove that the alternate mixing definition holds true on all of $L^2(\mu)$.
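This will follow from the fact that any $f,g\in L^2(\mu)$ can be approximated arbitrarily well by two sequences of simple measurable functions in $S$. Using these sequences, together with the Cauchy-Schwarz inequality and the fact that $U_T$ preserves the $L^2$ norm (because $T$ preserves $\mu$), the above argument can be modified to prove the general result.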
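One way to carry out this modification is sketched here. The symbols $\varepsilon$ and $C$ are introduced only for this illustration, and the sketch assumes $\mu(X)=1$ (so $\|1\|_2=1$), the Cauchy-Schwarz inequality, and $\|U_T^n h\|_2=\|h\|_2$ for every $h\in L^2(\mu)$, which holds because $T$ preserves $\mu$. Fix $f,g\in L^2(\mu)$ and $\varepsilon>0$, and choose $r,t\in S$ with $\|f-r\|_2<\varepsilon$ and $\|g-t\|_2<\varepsilon$, so that $\|r\|_2\le\|f\|_2+\varepsilon$. Then $$|\langle U_T^nf,g\rangle-\langle U_T^nr,t\rangle|\le|\langle U_T^n(f-r),g\rangle|+|\langle U_T^nr,g-t\rangle|\le\|f-r\|_2\|g\|_2+\|r\|_2\|g-t\|_2,$$ $$|\langle f,1\rangle\langle 1,g\rangle-\langle r,1\rangle\langle 1,t\rangle|\le|\langle f-r,1\rangle|\,|\langle 1,g\rangle|+|\langle r,1\rangle|\,|\langle 1,g-t\rangle|\le\|f-r\|_2\|g\|_2+\|r\|_2\|g-t\|_2.$$ Both right-hand sides are at most $\varepsilon C$ with $C=\|f\|_2+\|g\|_2+\varepsilon$. Since $\langle U_T^nr,t\rangle\to\langle r,1\rangle\langle 1,t\rangle$ by the case of simple functions, the triangle inequality gives $$\limsup_{n\to\infty}|\langle U_T^nf,g\rangle-\langle f,1\rangle\langle 1,g\rangle|\le 2\varepsilon C.$$ As $\varepsilon>0$ was arbitrary, the limit exists and equals $\langle f,1\rangle\langle 1,g\rangle$.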