I'm trying to follow a proof in my book of the Poincare Recurrence Theorem, but I have three questions about this proof:
Theorem: Let $(X,\Sigma,\mu)$ be a finite measure space, $f:X\to X$ be a measurable map, and $\mu$ be an $f$-invariant measure. For each set $A\in\Sigma$, we have $\mu(\{x\in A: f^n(x)\in A$ for infinitely many $n$ $\})=\mu(A)$.
Proof: Let $B=\{x\in A: f^n(x)\in A$ for infinitely many $n$ $\}$. We have \begin{align} B=A\cap\bigcap_{n=1}^{\infty}A_n=A\setminus\bigcup_{n=1}^{\infty}(A\setminus A_n) \end{align} where $A_n=\bigcup_{k=n}^{\infty}f^{-k}(A)$. We note that $A\setminus A_n\subset A_0\setminus A_n=A_0\setminus f^{-n}(A_0)$. Since $f^{-n}(A_0)=A_n\subset A_0$, and the measure $\mu$ is finite, it follows that: \begin{align} 0&\leq\mu(A\setminus A_n)\\ &\leq\mu(A_0\setminus f^{-n}(A_0))\\ &=\mu(A_0)-\mu(f^{-n}(A_0))\\ &=0 \end{align}
(because $\mu$ is $f$-invariant). It follows that $\mu(B)=\mu(A)$. QED
Question 1: The assumption that $\mu$ is finite is only relevant for the very last step $\mu(A_0)-\mu(f^{-n}(A_0))=0$, so we don't get $\infty-\infty$, correct?
Question 2: I'm trying to see why $B=A\cap\bigcap_{n=1}^{\infty}A_n$: If $x\in B$, then $x\in A$ and $f^m(x)\in A$ for infinitely many $m$. Let $I=\{m_1, m_2,\ldots\}$ be the set of such $m$. Then $x\in A\cap\bigcap_{m\in I}A_m\subseteq A\cap\bigcap_{n=1}^{\infty}A_n$, where the inclusion holds because $I$ is unbounded and the $A_n$ decrease in $n$. So $B\subseteq A\cap\bigcap_{n=1}^{\infty}A_n$. Conversely, let $x\in A\cap\bigcap_{n=1}^{\infty}A_n$. Suppose $f^k(x)\in A$ for only finitely many $k$. Let $M$ be the largest such $k$. Then since $x\in A\cap A_{M+1}$, $f^N(x)\in A$ for some $N>M$. Contradiction; hence there are infinitely many such $k$, and so $A\cap\bigcap_{n=1}^{\infty}A_n\subseteq B$. Is this right?
Question 3: Why, in the first two equalities of the proof, does the index begin at $1$ instead of $0$?
Q1: Heuristically speaking, PRT is a measure-theoretic pigeonhole principle of sorts: in a space of finite measure, the iterates of a set under a measure-preserving map cannot keep occupying fresh room forever, which is why there is recurrence (on an infinite measure space recurrence can fail; consider a translation on $\Bbb{R}$ with Lebesgue measure). For practical purposes, finiteness of the measure is required for the so-called "excision property": if $A\subseteq B$, then $\mu(B\setminus A)=\mu(B)-\mu(A)$. The standard proof goes like this: $\mu(B)=\mu(B\setminus A)+\mu(A) \implies \mu(B)-\mu(A)=\mu(B\setminus A)$, where the cancellation makes sense provided that $\mu(A)$ is finite. (Here $A\subseteq B$ denote generic measurable sets, not the sets $A$ and $B$ of the proof.) In the proof, this property is applied to $f^{-n}(A_0)\subseteq A_0$.
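To see concretely what goes wrong without finiteness, here is a sketch of the translation example. The specific choices are mine, for illustration only: $X=\Bbb{R}$ with Lebesgue measure $\lambda$, $f(x)=x+1$, and $A=[0,1)$. With $A_n=\bigcup_{k\geq n}f^{-k}(A)$ as in the proof:

\begin{align} f^{-k}(A)&=[-k,1-k), & A_n&=\bigcup_{k\geq n}[-k,1-k)=(-\infty,1-n),\\ A_0\setminus f^{-n}(A_0)&=[1-n,1), & \lambda\left(A_0\setminus f^{-n}(A_0)\right)&=n\neq0. \end{align}

Here $\lambda(A_0)-\lambda(f^{-n}(A_0))$ would read $\infty-\infty$, so the excision step is unavailable. The conclusion fails as well: $\bigcap_{n\geq1}A_n=\emptyset$, hence $B=\emptyset$ and $\lambda(B)=0<1=\lambda(A)$.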
Q2: Your argument is correct, but perhaps there is a more straightforward argument:
\begin{align} B&\stackrel{\text{def}}{=}\{x\in A\mid f^n(x)\in A\text{ for infinitely many } n\} =\{x\in A\mid \exists\, n_1<n_2<\cdots \text{ with } f^{n_j}(x)\in A \text{ for all } j\}\\ &=\{x\in A\mid \forall n\geq1,\ \exists k\geq n: f^k(x)\in A\}=\bigcap_{n\geq1}\bigcup_{k\geq n}\left(A\cap f^{-k}A\right)= A\cap \bigcap_{n\geq1}\bigcup_{k\geq n}f^{-k}A. \end{align}
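As a sanity check of this identity, here is a toy example of my own choosing (not from the book): take $X=\{0,1,2\}$ with counting measure, $f(x)=x+1 \bmod 3$, and $A=\{0\}$. Since $f$ is a bijection, counting measure is $f$-invariant, and

\begin{align} f^{-k}A&=\{x\in X\mid x+k\equiv0 \pmod 3\}=\{-k \bmod 3\},\\ \bigcup_{k\geq n}f^{-k}A&=X \quad\text{for every } n\geq1,\\ A\cap\bigcap_{n\geq1}\bigcup_{k\geq n}f^{-k}A&=A\cap X=\{0\}. \end{align}

This agrees with the definition of $B$ computed directly: $f^n(0)=0$ exactly when $3\mid n$, which happens for infinitely many $n$, so $B=\{0\}=A$.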
Q3: It could, but since we are taking the intersection with $A$, $n=0$ is redundant:
\begin{align} A\cap\bigcap_{n\geq0}A_n&= A\cap A_0 \cap \bigcap_{n\geq1}A_n \\ &= A\cap \left(f^{-0}A\cup\bigcup_{n\geq1}f^{-n}A\right)\cap\bigcap_{n\geq1}A_n \\ &= A\cap (A\cup A_1)\cap\bigcap_{n\geq1}A_n = A\cap\bigcap_{n\geq1}A_n. \end{align}
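For completeness, the closing step of the proof ("It follows that $\mu(B)=\mu(A)$") can be unpacked as follows. This is my reading of the argument, not a quote from the book. Write $N=\bigcup_{n\geq1}(A\setminus A_n)\subseteq A$ (the name $N$ is mine), so that $B=A\setminus N$. Then

\begin{align} \mu(N)&\leq\sum_{n\geq1}\mu(A\setminus A_n)=0,\\ \mu(A)&=\mu(A\setminus N)+\mu(N)=\mu(B)+0=\mu(B). \end{align}

This step uses only countable subadditivity and additivity, with no subtraction, so finiteness of $\mu$ plays no role here; it enters only through the excision step $\mu(A_0\setminus f^{-n}(A_0))=\mu(A_0)-\mu(f^{-n}(A_0))$.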
P.S.: For future reference I believe the book you are following is Barreira & Valls' Dynamical Systems: An Introduction.