We have the definition of independence of events $A_1,A_2,\dots,A_n$:
$$\mathbf P\left(\bigcap_{i\in S}A_i\right)=\prod_{i\in S}\mathbf P\left(A_i\right), \forall S\subset\{1, 2, \dots, n\}$$
This means that, for three events $A_1, A_2, A_3$, $\mathbf P(A_1\cap A_2\cap A_3)=\mathbf P(A_1)\cdot \mathbf P(A_2)\cdot \mathbf P(A_3)$ alone is not sufficient for them to be independent because it does not imply the other three criteria:
- $\mathbf P(A_1\cap A_2)=\mathbf P(A_1)\cdot \mathbf P(A_2)$
- $\mathbf P(A_2\cap A_3)=\mathbf P(A_2)\cdot \mathbf P(A_3)$
- $\mathbf P(A_3\cap A_1)=\mathbf P(A_3)\cdot \mathbf P(A_1)$
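To make this concrete, here is a small sketch of a counterexample. The sample space (a fair 8-sided die) and the particular sets `A1`, `A2`, `A3` are assumptions chosen for illustration; they are not part of the original question.

```python
from fractions import Fraction

# Assumed sample space: a fair 8-sided die, every outcome has probability 1/8.
OMEGA = frozenset(range(1, 9))


def prob(event):
    """Probability of an event (a subset of OMEGA) under the uniform measure."""
    return Fraction(len(event & OMEGA), len(OMEGA))


# Illustrative events, each with probability 1/2.
A1 = frozenset({1, 2, 3, 4})
A2 = frozenset({1, 2, 3, 4})
A3 = frozenset({1, 5, 6, 7})

# The triple product identity holds: both sides equal 1/8.
triple_ok = prob(A1 & A2 & A3) == prob(A1) * prob(A2) * prob(A3)

# The pairwise identity for A1, A2 fails: 1/2 on the left, 1/4 on the right.
pair_12_ok = prob(A1 & A2) == prob(A1) * prob(A2)

print(triple_ok, pair_12_ok)
```

Here the triple identity is satisfied while $\mathbf P(A_1\cap A_2)\neq\mathbf P(A_1)\cdot\mathbf P(A_2)$, so the three events are not independent.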
However, three discrete random variables $X, Y, Z$ are independent as soon as $p_{X,Y,Z}(x,y,z)=p_X(x)\cdot p_Y(y)\cdot p_Z(z),\forall x,y,z$.
Once this equation is satisfied, we have
$$\begin{align} p_{X,Y}(x,y)&=\sum_zp_{X,Y,Z}(x,y,z)\\ &=\sum_zp_X(x)\cdot p_Y(y)\cdot p_Z(z)\\ &=p_X(x)\cdot p_Y(y)\sum_zp_Z(z)\\ &=p_X(x)\cdot p_Y(y) \end{align}$$
Similarly, we have $p_{Y,Z}(y,z)=p_Y(y)\cdot p_Z(z)$ and $p_{Z,X}(z,x)=p_Z(z)\cdot p_X(x)$.
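The marginalization argument can also be checked numerically. The sketch below uses made-up marginal PMFs `p_X`, `p_Y`, `p_Z` (assumptions for illustration only), builds the joint PMF as their product, sums out $z$, and compares the result with $p_X(x)\cdot p_Y(y)$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginal PMFs; each sums to 1.
p_X = {0: Fraction(1, 3), 1: Fraction(2, 3)}
p_Y = {0: Fraction(1, 4), 1: Fraction(3, 4)}
p_Z = {0: Fraction(1, 2), 1: Fraction(1, 5), 2: Fraction(3, 10)}

# Joint PMF defined by the factorization p_{X,Y,Z} = p_X * p_Y * p_Z.
joint = {
    (x, y, z): p_X[x] * p_Y[y] * p_Z[z]
    for x, y, z in product(p_X, p_Y, p_Z)
}

# Marginalize over z to obtain p_{X,Y}.
p_XY = {}
for (x, y, z), p in joint.items():
    p_XY[(x, y)] = p_XY.get((x, y), Fraction(0)) + p

# The pairwise factorization follows because the p_Z values sum to 1.
pairwise_factorizes = all(
    p_XY[(x, y)] == p_X[x] * p_Y[y] for x, y in product(p_X, p_Y)
)
print(pairwise_factorizes)
```

Exact fractions are used instead of floats so that the equality test is not affected by rounding.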
Why is there such a difference between the two definitions of independence? Is there an intuitive explanation for this?
Looking at events and random variables together we have:
$A_1,\dots, A_n$ are independent events if and only if $1_{A_1},\dots,1_{A_n}$ are independent random variables.
That's quite fair, isn't it? You could say that events and random variables are treated the same way when it comes to independence.
But this induces the following definition for independence of events $A_1,\dots,A_n$: for every choice of $E_i\in\{A_i,A_i^c\}$, $i=1,\dots,n$, $$\Pr(E_1\cap\cdots\cap E_n)=\Pr(E_1)\times\cdots\times\Pr(E_n)$$
So actually it is a cluster of $2^n$ identities.
It is equivalent to the definition you mention in your question and is a stronger condition than the (insufficient) condition $\Pr(A_1\cap\cdots\cap A_n)=\Pr(A_1)\times\cdots\times\Pr(A_n)$.
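As a rough sanity check of this equivalence (not a proof), the sketch below tests both definitions on two finite examples with uniform probability. The function names `independent_by_subsets` and `independent_by_complements`, as well as the two example spaces, are assumptions introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod


def prob(event, omega):
    """Uniform probability of an event that is a subset of omega."""
    return Fraction(len(event), len(omega))


def independent_by_subsets(events, omega):
    """Product rule for every sub-collection of at least two events."""
    omega = frozenset(omega)
    for k in range(2, len(events) + 1):
        for subset in combinations(events, k):
            inter = omega.intersection(*subset)
            if prob(inter, omega) != prod(prob(e, omega) for e in subset):
                return False
    return True


def independent_by_complements(events, omega):
    """The 2^n identities: each E_i is either A_i or its complement."""
    omega = frozenset(omega)
    for flips in product([False, True], repeat=len(events)):
        picked = [omega - e if flip else e for e, flip in zip(events, flips)]
        inter = omega.intersection(*picked)
        if prob(inter, omega) != prod(prob(e, omega) for e in picked):
            return False
    return True


# Example 1: three fair coin tosses, H[i] = "toss i is heads".
coins = list(product([0, 1], repeat=3))
H = [frozenset(w for w in coins if w[i] == 1) for i in range(3)]

# Example 2: a fair 8-sided die where only the triple product identity holds.
die = range(1, 9)
B = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 4}), frozenset({1, 5, 6, 7})]

print(independent_by_subsets(H, coins), independent_by_complements(H, coins))
print(independent_by_subsets(B, die), independent_by_complements(B, die))
```

Both checks agree on each example: they accept the coin tosses and reject the die events, even though the die events satisfy $\Pr(A_1\cap A_2\cap A_3)=\Pr(A_1)\Pr(A_2)\Pr(A_3)$.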