Definition (independence): $(A_i)_{i \in I}$ is called stochastically independent if for all finite non-empty subsets $J \subset I$ we have \begin{align} \mathbb{P}\left(\bigcap_{j \in J} A_j\right) = \prod_{j \in J} \mathbb{P}(A_j). \end{align}
Theorem: $(A_k)_{k = 1}^{n}$ is independent if and only if for each choice of $B_k \in \{A_k, A_k^C\}$, $k \in \{1, \ldots, n\}$ we have \begin{equation*} \mathbb{P}\left(\bigcap_{k = 1}^{n} B_k\right) = \prod_{k = 1}^{n} \mathbb{P}(B_k). \end{equation*}
Proof: "$\impliedby$": By adding the product formulas for $\{B_1, \ldots, B_n\}$ and $\{B_1^C, B_2, \ldots, B_n\}$ we obtain \begin{equation} \tag{1} \mathbb{P}\left(\bigcap_{k = 2}^{n} B_k\right) = \prod_{k = 2}^{n} \mathbb{P}(B_k). \end{equation} Therefore one obtains the formula for intersections of $n - 1$ sets, then $n - 2$, and so on. $\square$
My Question: I don't understand how (1) is obtained, nor the reasoning following it: If I add $$ \mathbb{P}\left(\bigcap_{k = 1}^{n} B_k\right) = \prod_{k = 1}^{n} \mathbb{P}(B_k) \qquad \text{and} \qquad \mathbb{P}\left(\bigcap_{k = 2}^{n} B_k \cap B_1^C\right) = \prod_{k = 2}^{n} \mathbb{P}(B_k) \cdot \mathbb{P}(B_1^C) $$ I get $$ \mathbb{P}\left(\bigcap_{k = 1}^{n} B_k\right) + \mathbb{P}\left(\bigcap_{k = 2}^{n} B_k \cap B_1^C\right) = \prod_{k = 1}^{n} \mathbb{P}(B_k) + \prod_{k = 2}^{n} \mathbb{P}(B_k) \cdot \mathbb{P}(B_1^C) = \prod_{k = 2}^{n} \mathbb{P}(B_k) \underbrace{\left[ \mathbb{P}(B_1) + \mathbb{P}(B_1^C) \right]}_{= 1}. $$ But how does the LHS simplify? I guess I have to use the definition, but since I only know that the $(A_k)_{k = 1}^{n}$ are independent by hypothesis, what can I say about the independence of $(B_k^{(C)})_{k = 1}^{n}$?
The crucial observation, I think, is that
$\bigcap_{k=1}^n B_k$ and $B_1^\complement \cap \bigcap_{k=2}^n B_k$ are disjoint (a point cannot be both in $B_1$ and its complement $B_1^\complement$) and have $\bigcap_{k=2}^n B_k$ as their union:
$$\bigcap_{k=2}^n B_k = \bigcap_{k=2}^n B_k \cap \left(B_1 \cup B_1^\complement\right) = \left(\bigcap_{k=1}^n B_k\right) \cup \left(\bigcap_{k=2}^n B_k \cap B_1^\complement \right)$$
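by the usual distributive law $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$. Since the two sets are disjoint, finite additivity of $\mathbb{P}$ says that the sum of their probabilities equals the probability of their union, which is $\mathbb{P}\left(\bigcap_{k=2}^n B_k\right)$. That is how the LHS of your sum simplifies.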
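To tie the pieces together, here is a sketch of the whole chain in the notation of the question. It assumes only the disjoint decomposition above and the product formula for every choice of $B_k \in \{A_k, A_k^C\}$:

$$
\begin{aligned}
\mathbb{P}\left(\bigcap_{k=2}^n B_k\right)
&= \mathbb{P}\left(\bigcap_{k=1}^n B_k\right) + \mathbb{P}\left(B_1^\complement \cap \bigcap_{k=2}^n B_k\right) \\
&= \mathbb{P}(B_1) \prod_{k=2}^n \mathbb{P}(B_k) + \mathbb{P}(B_1^\complement) \prod_{k=2}^n \mathbb{P}(B_k) \\
&= \prod_{k=2}^n \mathbb{P}(B_k).
\end{aligned}
$$

Note that in the direction "$\impliedby$" the hypothesis is the product formula for all choices of the $B_k$, not the independence of the $A_k$, so both formulas being added are available from the start and nothing about independence of the $B_k$ is needed. The index $1$ is not special either: adding the formulas for $B_j$ and $B_j^\complement$ removes any index $j$, and the resulting formula for $n - 1$ sets again holds for every choice of the remaining $B_k$. Repeating the step therefore reaches any non-empty $J \subset \{1, \ldots, n\}$, and taking $B_j = A_j$ for $j \in J$ gives exactly the product formula in the definition of independence.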