I've learned my entire two days of probability theory via Wikipedia. So this is bothering me: how is it that
$$P(A | B) = \frac{P(A\cap B)}{P(B)} \tag{1}$$
and two events are called independent when
$$P(A\cap B) = P(A)P(B) \tag{2}$$
In (1), we need $P(A\cap B)$ to be zero to say that $P(A)$ does not depend on $B$, but in (2) it's clear that we need $P(A\cap B)$ to be nonzero to say the same thing: $P(A)$ does not depend on $B$.
If I look at this through the lens of sets, the independence of $A$ and $B$ would imply $A \cap B = \emptyset$, and I would expect, as in (1), that $P(\emptyset) = 0$. But thinking about real life, it's also kind of obvious that independent events can occur at the same time. Is (2) alluding to simultaneity, and (1) alluding to something else, maybe something physical?
I think these are different concepts with similar notations, or maybe I'm just fried.
I disagree with what you say about (1).
We do not need $P(A \cap B) = 0$ to say that $P(A)$ does not depend on $B$. In fact, if it were $0$ (while $P(A)$ and $P(B)$ are both positive), then your events are very much dependent.
As a (fun) example: let $A$ be the event that you eat more than $10$ chicken wings at a wing eating contest, and let $B$ be the event that you eat fewer than $9$ chicken wings at that contest.
We have that $P(A\cap B) = 0$ obviously, which leads to $P(A \mid B) = 0$, but that does not mean that $P(A)$ does not depend on $B$. It certainly does.
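A quick numerical check of this point, sketched in Python with a fair six-sided die standing in for the wing contest (my own stand-in example: $A$ = "roll at least 5", $B$ = "roll at most 2", so $A$ and $B$ are disjoint just like the wing events):

```python
from fractions import Fraction

# Sample space: a fair six-sided die; each outcome has probability 1/6.
P = Fraction(1, 6)

A = {5, 6}   # "roll at least 5"
B = {1, 2}   # "roll at most 2" -- disjoint from A

P_A  = P * len(A)       # 1/3
P_B  = P * len(B)       # 1/3
P_AB = P * len(A & B)   # 0, since A and B share no outcomes

# Disjoint events with positive probabilities FAIL the independence test (2):
print(P_AB)        # → 0
print(P_A * P_B)   # → 1/9, not equal to P(A ∩ B)

# Conditioning as in (1): knowing B occurred kills A entirely,
# even though P(A) alone is 1/3 -- so A very much depends on B.
print(P_AB / P_B)  # → 0
```

So $P(A \cap B) = 0$ is the opposite of independence here: the product $P(A)P(B) = 1/9$ is nonzero while the intersection has probability zero.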
Take the formula in (1) and multiply both sides of the equation by $P(B)$. Then we get that $$P(A\cap B) = P(B)P(A\mid B)$$
If we are allowed to substitute $P(A)$ for $P(A \mid B)$, that is, if the probability of $A$ is the same whether or not we know that event $B$ occurred, then the substitution gives $P(A\cap B) = P(B)P(A)$, which is exactly (2) — so we can declare independence.
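To see the substitution succeed on a genuinely independent pair, here is a sketch (my own example, again with exact fractions): two independent fair coin flips, $A$ = "first flip is heads", $B$ = "second flip is heads".

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair coin flips; each of the 4 outcomes has probability 1/4.
outcomes = list(product("HT", repeat=2))
P = Fraction(1, 4)

A = {o for o in outcomes if o[0] == "H"}  # first flip is heads
B = {o for o in outcomes if o[1] == "H"}  # second flip is heads

P_A  = P * len(A)       # 1/2
P_B  = P * len(B)       # 1/2
P_AB = P * len(A & B)   # 1/4

# The independence test (2) holds: P(A ∩ B) = P(A) P(B) ...
assert P_AB == P_A * P_B
# ... equivalently, by (1), conditioning on B leaves P(A) unchanged:
assert P_AB / P_B == P_A
```

Note that $P(A \cap B) = 1/4 \neq 0$: independent events can, and typically do, occur together.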