The visual intuition I use for understanding conditional probability seems to be breaking down!
I always understood the conditional probability $P(A|B)$ like this:
$P(A|B)$ answers the question: given that our outcome (in the overall sample space represented by the outer box) is in the event B, what is the probability that it is also in the event A?
This quantity $P(A|B)$ will be high if they are almost overlapping (the sets A and B contain similar outcomes) since being in B means a high probability it is also in A.
The extreme case is when A and B overlap completely, so that $A = B$, which gives $P(A|B) = P(A|A) = 1$.
Conversely, as the circles get farther apart and the events share fewer outcomes, the conditional probability diminishes.
It also gives a nice explanation for the formula for conditional probability, which is $$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Which we can kind of understand as "the overlap between the two events" (the intersection) normalized by the entirety of the "area" of the event B.
It all makes sense and has been helping me grasp these ideas of conditional probability a lot, but it's beginning to fall apart with the ideas of independence vs. disjointness.
Two events are disjoint if their intersection is the empty set.
Now, everything is telling me that since these two are disjoint, and not touching each other, they should be independent. Being in B no longer tells us anything about whether you're in A or not. But by definition, they cannot be independent, since
$$\text{if independent: } P(A \cap B) = P(A)P(B)$$ $$\text{if disjoint: } P(A \cap B) = 0$$
Therefore, if $P(A) \neq 0$ and $P(B) \neq 0$, disjoint events cannot be independent: independence would require $P(A|B) = P(A) > 0$, whereas disjointness forces $P(A|B) = 0$.
This breaks down the whole visual intuition! Does anyone have any nice ways of reconciling this fact or alternate visual intuitions that can help guide me? It's really easy for me to get lost without things like this.
Thanks!



"Now, everything is telling me that since these two are disjoint, and not touching each other, that they should be independent." is incorrect (and a very common mistake among students!)
Disjoint means that they can't happen simultaneously.
Independent means that knowing that one happened does not affect the probability that the other happened, i.e. $\Pr(A\vert B)=\Pr(A)$.
If they are disjoint, you know that they can't happen simultaneously (no overlap) and if $B$ happened, clearly $A$ didn't.
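For a concrete case (my own illustrative choice, not something from the question), take one roll of a fair six-sided die with $A = \{1, 2\}$ and $B = \{3, 4\}$. These are disjoint, and
$$\Pr(A \vert B) = \frac{\Pr(A \cap B)}{\Pr(B)} = \frac{0}{1/3} = 0 \neq \frac{1}{3} = \Pr(A)$$
so learning that $B$ happened changes the probability of $A$ from $1/3$ to $0$. That is as dependent as two events can be.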
The way to correct the visual intuition: $A$ and $B$ are independent if the fraction of $B$ that $A$ cuts out is the same as the fraction of the entire square that $A$ cuts out. So when you know $B$ happened (the "new sample space"), the share of it taken up by $A$ is the same as $A$'s share of the original sample space. If they are disjoint (and both have nonzero probability), this is clearly not the case: $A$ takes up none of $B$ but a positive share of the square.
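If it helps to check the picture numerically, here is a rough Python sketch. It assumes the square is chopped into a 12 by 12 grid of equally likely cells; the grid size, the particular events, and the helper names `prob`, `cond_prob` and `independent` are all made up for illustration.

```python
from fractions import Fraction
from itertools import product

N = 12  # assumed grid resolution (divisible by 2 and 3 so the shares are exact)
omega = set(product(range(N), range(N)))  # cells (x, y), all equally likely


def prob(event):
    """Share of the whole square covered by the event."""
    return Fraction(len(event), len(omega))


def cond_prob(a, b):
    """Share of b covered by a, i.e. Pr(a | b)."""
    return prob(a & b) / prob(b)


def independent(a, b):
    """Product rule: Pr(a and b) == Pr(a) * Pr(b)."""
    return prob(a & b) == prob(a) * prob(b)


# Overlapping events: left half and bottom half of the square.
left_half = {(x, y) for x, y in omega if x < N // 2}
bottom_half = {(x, y) for x, y in omega if y < N // 2}

# Disjoint events: left third and right third of the square.
left_third = {(x, y) for x, y in omega if x < N // 3}
right_third = {(x, y) for x, y in omega if x >= 2 * N // 3}

# The left half covers 1/2 of the bottom half and 1/2 of the square.
print(cond_prob(left_half, bottom_half), prob(left_half))    # 1/2 1/2
print(independent(left_half, bottom_half))                   # True

# The left third covers none of the right third but 1/3 of the square.
print(cond_prob(left_third, right_third), prob(left_third))  # 0 1/3
print(independent(left_third, right_third))                  # False
```

The independent pair is the one where the two shares match, and it needs the events to overlap by exactly the right amount. The disjoint pair fails the check because its overlap is zero while both events have positive area.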