We say that a measure-preserving transformation $T: X\to X$ of a probability space $(X, \mathscr{B}, \mu)$ is weak mixing if, for all $A, B\in \mathscr{B}$, $$\lim_{n\to \infty} \frac{1}{n} \sum_{j=0}^{n-1} |\mu(T^{-j}A \cap B)-\mu(A)\mu(B)|=0$$ and strong mixing if, for all $A, B\in \mathscr{B}$, $$\lim_{n\to \infty} \mu(T^{-n} A\cap B)=\mu(A)\mu(B).$$
My notes simply state that strong mixing clearly implies weak mixing, but I can't see how to show this. How can I prove this supposedly obvious result?
This has nothing to do with ergodic theory: it is a general fact about convergent sequences of real numbers.
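Exercise: If $(a_n)_n$ is a sequence of reals converging to some $a \in \mathbb{R}$, then $\lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} |a_n-a| = 0$.
Now fix $A, B\in \mathscr{B}$ and apply this with $a_n = \mu(T^{-n}A \cap B)$ and $a = \mu(A)\mu(B)$: strong mixing says exactly that $a_n \to a$, and the conclusion of the exercise is the weak mixing condition. (The sum in the definition runs over $j=0,\dots,n-1$ rather than $1,\dots,n$, but the two averages differ by at most $\frac{1}{n}$, since every term lies in $[0,1]$.)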
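In case it helps, here is one possible sketch of the exercise using the standard $\varepsilon$-argument. The names $M$ and $C$ are introduced here only for illustration. Given $\varepsilon > 0$, choose $M$ such that $|a_n - a| < \varepsilon/2$ for all $n > M$, and set $C = \sum_{n=1}^{M} |a_n - a|$. Then for every $N > M$,
$$\frac{1}{N}\sum_{n=1}^{N} |a_n-a| = \frac{1}{N}\sum_{n=1}^{M} |a_n-a| + \frac{1}{N}\sum_{n=M+1}^{N} |a_n-a| \le \frac{C}{N} + \frac{N-M}{N}\cdot\frac{\varepsilon}{2} \le \frac{C}{N} + \frac{\varepsilon}{2}.$$
Since $C$ does not depend on $N$, we have $\frac{C}{N} \le \frac{\varepsilon}{2}$ as soon as $N \ge 2C/\varepsilon$, so the average is at most $\varepsilon$ for all sufficiently large $N$.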