I am trying to understand the proof of the (second) Borel-Cantelli lemma:
Let $(A_n)$ be a sequence of independent events. If $\sum_n P(A_n) = \infty$, then $P(\limsup A_n) = 1$.
To show that $P(\limsup A_n) = 1$, we have to prove that $P((\limsup A_n)^c) = 0$:
$P((\limsup A_n)^c) = P\left( \bigcup_p \bigcap_{n \geq p} A_n^c \right) \leq \sum_p P\left( \bigcap_{n \geq p} A_n^c \right)$.
Then the proof focuses on showing that $P( \cap_{n \geq p} A_n^c ) = 0$ for all $p$, using $1-x \leq \exp(-x)$. Then it says that $\sum_p P( \cap_{n \geq p} A_n^c ) = 0$, and therefore $P( ( \limsup A_n )^c ) = 0$, which is what we wanted.
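For reference, here is how I understand that first step. The truncation at a finite index $N$ and the final limit (continuity of $P$ along decreasing events) are my own reconstruction, not copied from the proof, so they may differ from the original in detail. For fixed $p$ and any $N \geq p$, independence of the $A_n$ (and hence of the $A_n^c$) gives

$$ P\left( \bigcap_{n=p}^{N} A_n^c \right) = \prod_{n=p}^{N} \bigl( 1 - P(A_n) \bigr) \leq \prod_{n=p}^{N} \exp\bigl( -P(A_n) \bigr) = \exp\left( -\sum_{n=p}^{N} P(A_n) \right) \xrightarrow[N \to \infty]{} 0, $$

because $\sum_{n \geq p} P(A_n) = \infty$ for every $p$. Since the events $\bigcap_{n=p}^{N} A_n^c$ decrease to $\bigcap_{n \geq p} A_n^c$ as $N$ grows, this should give $P( \cap_{n \geq p} A_n^c ) = 0$ for each $p$.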
The part that is unclear to me is how we go from $$P( \cap_{n \geq p} A_n^c ) = 0 ~ \forall p$$ to $$ \sum_p P( \cap_{n \geq p} A_n^c ) = 0 $$ Why do we jump from the finite case to infinity? What happened?
===== EDIT =====
I read the proof of the lemma on Wikipedia, and I don't understand it either. I guess the part that I fail to understand is the same in both versions:
To show that $\lim_N P( \cap_{n \geq N} E_n^c) = 0 $, Wikipedia says:
it is enough to show: $ P( \cap_{n \geq N} E_n^c) = 0 $
Here again, I don't understand. Why is something that is true for all $N$ also true in the limit $N \to \infty$?