The proposition:
Let $p$ be a non-zero prime ideal of $A$. The ring $B/pB$ is an $A/p$-algebra of degree $n=[L:K]$, isomorphic to the product $\prod_{P\mid p}B/P^{e_P}$. We have the formula $n=\sum_{P\mid p}e_Pf_P$.
Then here is his proof:
Let $S=A-p$, let $A'=S^{-1}A$ and $B'=S^{-1}B$. The ring $A'=A_p$ is a DVR, and $B'$ is its integral closure in $L$. One has $A'/pA'=A/p$ and one sees easily $B'/pB'=B/pB$.
Q1: I understand that $B'$ should be the integral closure of $A'$ as $B$ is the integral closure of $A$. However, why do we have $A'/pA'=A/p$? Shouldn't it be isomorphic? I would assume this is just a typo as otherwise it really doesn't make any sense to me.
As $A'$ is principal, hypothesis $(F)$, i.e. $B$ is a finitely generated $A$-module, shows that $B'$ is a free module of rank $n=[L:K]$ and $B'/pB'$ is free of rank $n$ over $A'/pA'$. Thus $B/pB$ is an algebra of degree $n$.
Q2: I get that if $B$ is a f.g. $A$-module then $B'$ should be a finitely generated $A'$-module. However, why is it of rank $n$, and why is it free? Is Serre using the structure theorem for f.g. modules over a PID here? Then we could argue that it is torsion-free and so it must be free?
Since $pB=\bigcap P^{e_P}$, the canonical map $$B/pB\to \prod_{P\mid p}B/P^{e_P}$$ is injective. The approximation lemma shows that it is surjective. Hence it is an isomorphism.
Q3: Okay, what is the canonical map here? Are we talking about direct products when we say $\prod_{P\mid p}B/P^{e_P}$? In that case, can we invoke the Chinese Remainder Theorem (those primes should be maximal by the Dedekind domain condition) instead of using the approximation lemma? If it is not a direct product, how does this product even make sense? Also, how is the approximation lemma used here? Since I don't even understand what the map is, I have no clue about this either.
By comparing degrees, one sees that $n$ is the sum of the degrees $$n_P=[B/P^{e_P}:A/p]$$ One has $n_P=\sum_{i=0}^{e_P-1}[P^i/P^{i+1}:A/p]=e_P\cdot [B/P:A/p]=e_Pf_P$, which proves the proposition.
I have seen a proof showing this formula using norms on the ideals, and it makes much more sense than this one... If anyone could help clear up some of my confusion, I would much appreciate it.
Just a general comment: you may want to learn these basics from a standard introduction to algebraic number theory (e.g. Marcus). Serre's book is excellent but this background material is not its main focus and hence it goes very quickly through it, assuming the reader is already roughly familiar with it.
Q1: They are canonically isomorphic to each other, and both isomorphic to the residue field so I think it's ok to call them equal. In particular one can always take the cosets in $A'/pA'$ to be represented by elements of $A$, and the arithmetic then exactly matches that of $A/p$.
Q2: Yes, this follows directly from the structure theorem for modules over a PID. It is clear that $B'$ is torsion-free because it is an integral domain and contains $A'$ (a torsion element would be a zero divisor). The rank is at most $n$ because $L$ has dimension $n$ as a $K$-vector space, and a $K$-dependence can be transformed into an $A'$-dependence by clearing denominators, so there is no independent set over $A'$ of more than $n$ elements in $B'$. Conversely, one can take a $K$-basis for $L$ and "clear the denominators" in the same way to put those elements in $B'$. Linear independence over $K$ certainly implies independence over $A'$, since all we've done is shrink our ring of scalars.
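To make the denominator-clearing concrete, here is an illustrative case (my own choice of example, not taken from Serre): take $A=\mathbb Z$, $p=(5)$ and $L=\mathbb Q(i)$, so that $A'=\mathbb Z_{(5)}$ and $B'=\mathbb Z_{(5)}[i]$. The $\mathbb Q$-basis $\{\tfrac12, \tfrac i5\}$ of $L$ becomes, after multiplying through by $10$, $$\{5,\ 2i\}\subset B',$$ which is still linearly independent over $A'$, so the rank is at least $2=[L:K]$. Note that these two elements do not form an $A'$-basis of $B'$ (here $\{1,i\}$ does); the rank argument only needs independence.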
Q3: I would use the Chinese Remainder Theorem here too. I think the Approximation Lemma is a sort-of way of stating the Chinese Remainder theorem, but a little bit opaque.
In each component, the map is the obvious "reduce further" map, by which I mean $$a + pB \mapsto a + \mathfrak P^e$$ which is well-defined because $\mathfrak P^e$ contains $pB$. E.g. after reducing an element mod $12$ you can always reduce it further mod $4$, because $(12)$ is contained in $(4)$.
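For the integer analogue of this map, a brute-force check is easy. The sketch below is only an illustration in $\mathbb Z$, where $12 = 2^2\cdot 3$ plays the role of $pB$ and $(4)$, $(3)$ play the role of the $\mathfrak P^{e}$; the function name `reduce_further` is mine, not standard.

```python
from itertools import product

def reduce_further(n, moduli):
    """Tabulate the map Z/n -> prod Z/m, a -> (a mod m for each m).

    It is well defined only when every m divides n, i.e. (n) is
    contained in each (m).
    """
    assert all(n % m == 0 for m in moduli), "each modulus must divide n"
    return {a: tuple(a % m for m in moduli) for a in range(n)}

# 12 = 4 * 3 with 4 and 3 coprime: the Chinese Remainder Theorem case.
phi = reduce_further(12, (4, 3))
images = set(phi.values())

injective = len(images) == 12                              # distinct classes stay distinct
surjective = images == set(product(range(4), range(3)))   # every pair is hit

print(injective, surjective)  # True True
```

Injectivity here corresponds to $(12) = (4)\cap(3)$, and surjectivity is the Chinese Remainder Theorem for the coprime ideals $(4)$ and $(3)$.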
Q4: The reason $n$ is the sum of those degrees is that as abelian groups $$B/pB \cong B'/pB' \cong (A')^n/p(A')^n \cong (A/p)^n,$$ whereas on the other side of the isomorphism we have $$\prod_{\mathfrak P\mid p} B/\mathfrak P ^{e_{\mathfrak P}},$$ and so we want to write each factor of this product as $(A/p)^{n_{\mathfrak P}}$ and compare the two sides to get the desired equality.
Q5: The argument by ideal norms is really equivalent to this one if you read the proofs carefully. Here is a slight rephrasing of Serre's argument: to determine the size of $B/\mathfrak P^{e_\mathfrak P}$, notice that it is filtered by the images of $\mathfrak P \supset \mathfrak P^2 \supset \cdots$, with $e_{\mathfrak P}$ successive quotients $B/\mathfrak P$, $\mathfrak P/\mathfrak P^2$, $\mathfrak P^2/\mathfrak P^3$ and so on, each of which is one-dimensional over $B/\mathfrak P$ (this is easy to see after localizing at $\mathfrak P$, where the ideals become principal). The field $B/\mathfrak P$ has dimension $f_{\mathfrak P}$ over $A/p$, so $B/\mathfrak P^{e_{\mathfrak P}}$ has dimension $e_{\mathfrak P}f_{\mathfrak P}$ over $A/p$. When the residue fields are finite, as for number fields, this is a statement about sizes: a $k$-vector space of dimension $m$ has $|k|^m$ elements, so $|B/\mathfrak P^{e_{\mathfrak P}}| = |A/p|^{e_{\mathfrak P}f_{\mathfrak P}}$. One of the "usual" proofs of the properties of the ideal norm is done by working locally, and passes through exactly this argument to determine $|A/p^r|$.
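As a sanity check in a familiar case (again my own example, not from Serre's text), take $A=\mathbb Z$ and $B=\mathbb Z[i]$, so $n=2$. The three possible behaviours are $$2B=(1+i)^2:\ e=2,\ f=1; \qquad 5B=(2+i)(2-i):\ e=f=1 \text{ for both primes}; \qquad 3B \text{ prime}:\ e=1,\ f=2,$$ and in each case $\sum e_{\mathfrak P}f_{\mathfrak P}=2$. In terms of sizes, $$|B/2B|=4=2^{2\cdot 1},\qquad |B/5B|=25=5^{1}\cdot 5^{1},\qquad |B/3B|=9=3^{1\cdot 2},$$ which matches $|A/p|^{n}=p^2$ every time.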