Let $\mathcal{E}=\{E_1,E_2,\dots,E_n\}$ be a set of events. We say that $\mathcal{E}$ is an independent set of events if the occurrence of any number of the events does not change the likelihood of the occurrence of the remaining events. To formalize this intuitive description of independence, one can suggest the following definition.
Definition 1. Let $\mathcal{E}=\{E_1,E_2,\dots,E_n\}$ be a set of events, $I=\{1,2,\dots,n\}$ be an index set and $\mathcal{C} = \{S \, | \, S \subset I, \,\, 1 \leq |S| \leq n-1 \}$. The set $\mathcal{E}$ is an independent set of events if $$\forall S \in \mathcal{C}, \qquad \forall i \in I-S, \qquad \mathbb{P}(E_i \mid \bigcap_{j\in S} E_j)=\mathbb{P}(E_i),$$ which yields \begin{align} N &= (n-1){{n}\choose{1}} + (n-2){{n}\choose{2}} + \dots+(n-(n-1)){{n}\choose{n-1}} = \sum_{i=1}^{n-1}(n-i){{n}\choose{i}} \\ &= \sum_{i=1}^{n-1}n{{n-1}\choose{i}}=n(2^{n-1}-1) = n 2^{n-1}-n \end{align} equations.
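As a sanity check on this count, here is a small Python sketch of my own (the helper name `count_def1_equations` is just an illustration): it enumerates the pairs $(S, i)$ directly and compares the tally with the closed form $n\,2^{n-1} - n$.

```python
from itertools import combinations

def count_def1_equations(n):
    """Count pairs (S, i): S a nonempty proper subset of I, i in I - S."""
    I = range(n)
    total = 0
    for size in range(1, n):            # 1 <= |S| <= n - 1
        for S in combinations(I, size):
            total += n - size           # choices of i in I - S
    return total

for n in range(2, 10):
    assert count_def1_equations(n) == n * 2**(n - 1) - n
```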
However, most probability books I have seen use the following definition.
Definition 2. Let $\mathcal{E}=\{E_1,E_2,\dots,E_n\}$ be a set of events, $I=\{1,2,\dots,n\}$ be an index set, and $\mathcal{C} = \{S \, | \, S \subseteq I, \,\, 2 \leq |S| \leq n \}$. The set $\mathcal{E}$ is an independent set of events if $$\forall S \in \mathcal{C}, \qquad \mathbb{P}(\bigcap_{i\in S} E_i)=\prod_{i\in S}\mathbb{P}(E_i),$$ which yields $$N = {{n}\choose{2}} + {{n}\choose{3}} + \dots + {{n}\choose{n}} = \sum_{i=2}^{n}{{n}\choose{i}}=2^n-{{n}\choose{1}} - {{n}\choose{0}} = 2^n - n - 1$$ equations.
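The same kind of check works here; this sketch (again only an illustration) tallies one equation per admissible subset and compares the result with $2^n - n - 1$.

```python
from math import comb

def count_def2_equations(n):
    """Count subsets S of I with 2 <= |S| <= n (one equation per subset)."""
    return sum(comb(n, size) for size in range(2, n + 1))

for n in range(2, 10):
    assert count_def2_equations(n) == 2**n - n - 1
```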
When $n=2$, Definition 1 says
$$\mathbb{P}(E_1|E_2) = \mathbb{P}(E_1), \qquad \mathbb{P}(E_2|E_1) = \mathbb{P}(E_2), \tag{1}$$
while Definition 2 says
$$\mathbb{P}(E_1 \cap E_2) = \mathbb{P}(E_1) \mathbb{P}(E_2). \tag{2}$$
Using the definition of conditional probability, one can verify that $(1)$ and $(2)$ are equivalent provided that $\mathbb{P}(E_1) > 0$ and $\mathbb{P}(E_2) > 0$.
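As a concrete illustration (a toy example of my own, not taken from any book), two fair coin flips give a four-point sample space on which both $(1)$ and $(2)$ can be checked numerically:

```python
# Two fair coin flips: a four-point sample space with uniform weights.
omega = {('H', 'H'): 0.25, ('H', 'T'): 0.25, ('T', 'H'): 0.25, ('T', 'T'): 0.25}

def prob(event):
    """Probability of the set of outcomes where `event` holds."""
    return sum(p for w, p in omega.items() if event(w))

E1 = lambda w: w[0] == 'H'      # first flip is heads
E2 = lambda w: w[1] == 'H'      # second flip is heads

p1, p2 = prob(E1), prob(E2)
p12 = prob(lambda w: E1(w) and E2(w))

assert abs(p12 - p1 * p2) < 1e-12     # equation (2)
assert abs(p12 / p2 - p1) < 1e-12     # P(E1 | E2) = P(E1), first half of (1)
assert abs(p12 / p1 - p2) < 1e-12     # P(E2 | E1) = P(E2), second half of (1)
```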
Questions
Are these definitions equivalent? If the answer is positive, what would be a nice way to prove it?
As I said, most books use Definition 2. What are the advantages of Definition 2 over Definition 1?
To get more clues about a proof, it is useful to write out the details for $n = 3$; a numerical sanity check of these equations follows the two lists below. In this case, Definition 1 gives
\begin{align} \mathbb{P}(E_1|E_2) &= \mathbb{P}(E_1), \qquad \mathbb{P}(E_2|E_1) = \mathbb{P}(E_2), \tag{1-1}\\ \mathbb{P}(E_2|E_3) &= \mathbb{P}(E_2), \qquad \mathbb{P}(E_3|E_2) = \mathbb{P}(E_3), \tag{1-2}\\ \mathbb{P}(E_3|E_1) &= \mathbb{P}(E_3), \qquad \mathbb{P}(E_1|E_3) = \mathbb{P}(E_1), \tag{1-3} \\ \mathbb{P}(E_1|E_2 \cap E_3) &= \mathbb{P}(E_1), \tag{1-4}\\ \mathbb{P}(E_2|E_3 \cap E_1) &= \mathbb{P}(E_2), \tag{1-5}\\ \mathbb{P}(E_3|E_1 \cap E_2) &= \mathbb{P}(E_3), \tag{1-6} \end{align}
while Definition 2 says
\begin{align} \mathbb{P}(E_1 \cap E_2) &= \mathbb{P}(E_1) \mathbb{P}(E_2), \tag{2-1}\\ \mathbb{P}(E_2 \cap E_3) &= \mathbb{P}(E_2) \mathbb{P}(E_3), \tag{2-2}\\ \mathbb{P}(E_3 \cap E_1) &= \mathbb{P}(E_3) \mathbb{P}(E_1), \tag{2-3} \\ \mathbb{P}(E_1 \cap E_2 \cap E_3) &= \mathbb{P}(E_1) \mathbb{P}(E_2) \mathbb{P}(E_3). \tag{2-4} \end{align}
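Here is the promised sanity check: a short script (my own toy model, three fair coin tosses) that verifies every equation listed in $(1\text{-}1)$–$(1\text{-}6)$ and $(2\text{-}1)$–$(2\text{-}4)$ on an eight-point sample space.

```python
from itertools import combinations, product

# Three fair coin tosses: eight equally likely outcomes.
omega = {w: 1 / 8 for w in product('HT', repeat=3)}

def prob(S):
    """P(intersection of E_i for i in S), where E_i = {i-th toss is heads}."""
    return sum(p for w, p in omega.items() if all(w[i] == 'H' for i in S))

marginal = [prob({i}) for i in range(3)]

# Definition 2, equations (2-1)-(2-4): every subset of size >= 2 factorizes.
for size in (2, 3):
    for S in combinations(range(3), size):
        target = 1.0
        for i in S:
            target *= marginal[i]
        assert abs(prob(set(S)) - target) < 1e-12

# Definition 1, equations (1-1)-(1-6): conditioning never changes P(E_i).
for size in (1, 2):
    for S in combinations(range(3), size):
        for i in set(range(3)) - set(S):
            cond = prob(set(S) | {i}) / prob(set(S))
            assert abs(cond - marginal[i]) < 1e-12
```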
We also note that by the definition of conditional probability we have
\begin{align} \mathbb{P}(E_i|E_j) &= \frac{\mathbb{P}(E_i \cap E_j)}{\mathbb{P}(E_j)}, \tag{3-1} \\ \mathbb{P}(E_i|E_j \cap E_k) &= \frac{\mathbb{P}(E_i \cap E_j \cap E_k)}{\mathbb{P}(E_j \cap E_k)}, \tag{3-2} \end{align}
for distinct values of $i$, $j$, and $k$ in the set $\{1, \, 2, \, 3 \}$, provided the conditioning events have positive probability. If $\mathbb{P}(E_i) > 0$ for every $i$, then $(3\text{-}1)$ is well defined; and once the pairwise relations $(2\text{-}1)$–$(2\text{-}3)$ are available, $\mathbb{P}(E_j \cap E_k) = \mathbb{P}(E_j)\,\mathbb{P}(E_k) > 0$, so $(3\text{-}2)$ is well defined as well.
A. To show that the set of equations $(1)$ implies the set of equations $(2)$, we use $(3-1)$ in $(1-1)$, $(1-2)$, and $(1-3)$ to get $(2-1)$, $(2-2)$, and $(2-3)$. As an example, we have
\begin{align} \mathbb{P}(E_1|E_2) = \mathbb{P}(E_1) &\implies \frac{\mathbb{P}(E_1 \cap E_2)}{\mathbb{P}(E_2)} = \mathbb{P}(E_1) \\ &\implies \mathbb{P}(E_1 \cap E_2) = \mathbb{P}(E_1) \mathbb{P}(E_2) \tag{4-1} \\ \mathbb{P}(E_2|E_1) = \mathbb{P}(E_2) &\implies \frac{\mathbb{P}(E_2 \cap E_1)}{\mathbb{P}(E_1)} = \mathbb{P}(E_2) \\ &\implies \mathbb{P}(E_1 \cap E_2) = \mathbb{P}(E_1) \mathbb{P}(E_2). \tag{4-2} \end{align}
Now, inserting $(3-2)$ into $(1-4)$, $(1-5)$, and $(1-6)$, and using $(2-1)$, $(2-2)$, and $(2-3)$, which were established in the previous step, yields $(2-4)$. As an example, we have
\begin{align} \mathbb{P}(E_1|E_2 \cap E_3) = \mathbb{P}(E_1) &\implies \frac{\mathbb{P}(E_1 \cap E_2 \cap E_3)}{\mathbb{P}(E_2 \cap E_3)} = \mathbb{P}(E_1) \\ &\implies \mathbb{P}(E_1 \cap E_2 \cap E_3) = \mathbb{P}(E_1) \mathbb{P}(E_2 \cap E_3) \\ &\implies \mathbb{P}(E_1 \cap E_2 \cap E_3) = \mathbb{P}(E_1) \mathbb{P}(E_2) \mathbb{P}(E_3). \tag{5-1} \end{align}
B. It is also straightforward to show that the set of equations $(2)$ implies the set of equations $(1)$. Divide $(2-1)$, $(2-2)$, and $(2-3)$ by the appropriate probability $\mathbb{P}(E_i)$ and use $(3-1)$ to get $(1-1)$, $(1-2)$, and $(1-3)$. Divide $(2-4)$ by $\mathbb{P}(E_j) \mathbb{P}(E_k)$, rewrite the denominator as $\mathbb{P}(E_j \cap E_k)$ via $(2-1)$–$(2-3)$, and use $(3-2)$ to get $(1-4)$, $(1-5)$, and $(1-6)$. These are exactly the procedures in $(4)$ and $(5)$, carried out in reverse order.
We have proved the equivalence of Definitions 1 and 2 for the cases $n = 2$ and $n = 3$. It should be possible to extend this to arbitrary $n$ by induction on $n$; a numerical experiment supporting the claim for several small values of $n$ is sketched below.
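While no substitute for the inductive proof, the following sketch (my own construction, with assumed helper names) builds a product measure with random marginals, confirms that it satisfies every equation of Definition 2, and then checks that every conditional equation of Definition 1 holds on it as well, for $n$ up to $5$.

```python
import random
from itertools import combinations, product

random.seed(42)

def check_definitions_agree(n):
    """Random product measure: satisfies Definition 2 by construction;
    verify that every conditional equation of Definition 1 holds too."""
    p = [random.uniform(0.1, 0.9) for _ in range(n)]   # marginals P(E_i)

    # Atom bits in {0,1}^n: bits[i] == 1 means E_i occurs on that atom.
    omega = {}
    for bits in product((0, 1), repeat=n):
        weight = 1.0
        for i, b in enumerate(bits):
            weight *= p[i] if b else 1 - p[i]
        omega[bits] = weight

    def prob(S):
        return sum(q for bits, q in omega.items() if all(bits[i] for i in S))

    # Definition 2: every subset with |S| >= 2 factorizes.
    for size in range(2, n + 1):
        for S in combinations(range(n), size):
            target = 1.0
            for i in S:
                target *= p[i]
            assert abs(prob(S) - target) < 1e-9

    # Definition 1: conditioning on any other events leaves P(E_i) unchanged.
    for size in range(1, n):
        for S in combinations(range(n), size):
            for i in set(range(n)) - set(S):
                assert abs(prob(set(S) | {i}) / prob(S) - p[i]) < 1e-9

for n in (2, 3, 4, 5):
    check_definitions_agree(n)
```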