I'm taking a probability and stats course and I'm very confused about the fact that mutually exclusive events are necessarily dependent.
The way I conceptualize dependence/independence is whether one event directly affects the other. For example, if it is raining, I am more likely to bring an umbrella. Or if I pick a king from a deck of cards, I'm less likely to pick another king on my next draw than if I had picked a non-king as my first card. Both of these examples have a sense of events being sequential, which makes it easy for me to understand that these non-mutually-exclusive events are dependent, that is, that $P(A|B) \ne P(A).$
However, consider the rolling of a die, where rolling an even number is event $A,$ and rolling an odd number is event $B.$ These events are clearly mutually exclusive, but I can't wrap my head around why they're also dependent. I don't get how the aforementioned formula works. How could you ever calculate the probability of the event of rolling an even number given the fact that you've rolled an odd number, if they're happening at the exact same time? How can two mutually exclusive events be dependent given that they are two results of the same experiment?
Perhaps this is just something that I'm looking at the wrong way, but I can't grasp the concept of events being dependent on each other without being sequential in some way. Has anyone else experienced this sort of confusion or have a way to grasp the concept?
Let's temporarily forget about independence and mutual exclusivity, and first address the pressing fundamental error in your conceptualisation. Now, consider a probability experiment (three coin flips) with sample space $\{ttt,htt,tht,tth,hht,hth,thh,hhh\}.$ Note that an event is literally a subset of the sample space; for instance,

event of obtaining at least two heads $=\{hht,hth,thh,hhh\}$

event of obtaining a head on the first flip $=\{htt,hth,hht,hhh\}$
Observe from the above that it does not generally make sense to frame events as existing at timepoints, let alone as being in sequence. After all, every event is just a specific ‘snapshot’ of associated outcomes! By ‘sequential’, you are thinking only of event pairs like rain-then-umbrella or first-draw-then-second-draw, where one event seems to precede the other.
As is clear by now, this framing works only for a strict subset of all the possible cases. An accurate characterisation of independence, corresponding to $P(B|A) = P(B),$ is this: knowing that $A$ occurs does not change the probability that $B$ occurs.
This characterisation (applicable only when $A$ has a nonzero probability) is actually a consequence of the following definition (applicable regardless of $A$'s probability): events $A$ and $B$ are independent if and only if $$P(A\cap B)=P(A)\,P(B).$$
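To make this definition concrete, here is a minimal Python sketch (the helpers `prob` and `independent` are my own names, assumed for illustration) that tests the product rule on the eight equally likely coin-flip outcomes above:

```python
from fractions import Fraction

# The eight equally likely outcomes of three coin flips.
omega = {"ttt", "htt", "tht", "tth", "hht", "hth", "thh", "hhh"}

def prob(event):
    """Probability of an event (a subset of omega), outcomes equally likely."""
    return Fraction(len(event), len(omega))

def independent(a, b):
    """True iff P(A ∩ B) = P(A) * P(B), the definition of independence."""
    return prob(a & b) == prob(a) * prob(b)

first_heads  = {w for w in omega if w[0] == "h"}        # first flip is heads
second_heads = {w for w in omega if w[1] == "h"}        # second flip is heads
two_heads    = {w for w in omega if w.count("h") >= 2}  # at least two heads

print(independent(first_heads, second_heads))  # True:  1/4 = 1/2 * 1/2
print(independent(first_heads, two_heads))     # False: 3/8 ≠ 1/2 * 1/2
```

Note that nothing here is sequential: each event is just a set of outcomes, and independence is a purely numerical relation between those sets.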
For example, in your rain scenario, with sample space $\{ru,\,r\overline u,\,\overline ru,\,\overline r\,\overline u\}:$

event of raining $=\{\color{brown}{ru},r\overline u\}$
event of bringing umbrella $=\{\color{brown}{ru},\overline ru\}$
And in your die-rolling experiment, with sample space $\{1,2,3,4,5,6\}:$

$A=$ event of obtaining even $=\{2,4,6\}$
$B=$ event of obtaining odd $=\{1,3,5\}$
$C=$ event of obtaining multiple of $3$ $=\{3,6\}$
$D=$ event of obtaining multiple of $5$ $=\{5\}$
By definition,

- $A$ and $C$ are independent, since $P(A\cap C)=P(\{6\})=\frac16=\frac12\cdot\frac13=P(A)\,P(C);$
- $B$ and $C$ are independent, since $P(B\cap C)=P(\{3\})=\frac16=\frac12\cdot\frac13=P(B)\,P(C);$
- $A$ and $D$ are dependent, since $P(A\cap D)=P(\varnothing)=0\ne\frac12\cdot\frac16=P(A)\,P(D);$
- $A$ and $B$ are dependent, since $P(A\cap B)=P(\varnothing)=0\ne\frac12\cdot\frac12=P(A)\,P(B).$
Depending on how you are intuiting the concept of independence, you may be surprised by some of the above bullets!
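These (in)dependence facts can be checked mechanically; a minimal Python sketch, assuming a fair die whose six faces are equally likely:

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}  # six equally likely faces

def prob(event):
    """Probability of an event (a subset of the die's faces)."""
    return Fraction(len(event), len(die))

def independent(x, y):
    """True iff P(X ∩ Y) = P(X) * P(Y)."""
    return prob(x & y) == prob(x) * prob(y)

A = {2, 4, 6}   # even
B = {1, 3, 5}   # odd
C = {3, 6}      # multiple of 3
D = {5}         # multiple of 5

print(independent(A, C))  # True:  1/6 = 1/2 * 1/3
print(independent(B, C))  # True:  1/6 = 1/2 * 1/3
print(independent(A, D))  # False: 0 ≠ 1/2 * 1/6 (also mutually exclusive)
print(independent(A, B))  # False: 0 ≠ 1/2 * 1/2 (mutually exclusive, nonzero probabilities)
```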
By definition! $P(A|B)=\dfrac{P(A\cap B)}{P(B)}=\dfrac{P(\varnothing)}{1/2}=0.$
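The same computation, sketched in Python under the assumption of a fair die (the helper `prob` is a hypothetical convenience, not part of the answer):

```python
from fractions import Fraction

die = {1, 2, 3, 4, 5, 6}

def prob(event):
    return Fraction(len(event), len(die))

A = {2, 4, 6}  # even
B = {1, 3, 5}  # odd

# Conditional probability via its definition: P(A|B) = P(A ∩ B) / P(B).
p_a_given_b = prob(A & B) / prob(B)
print(p_a_given_b)  # 0
```

No sequencing is needed: the conditional probability is just a ratio of two set probabilities.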
More accurately, two mutually exclusive events are necessarily dependent provided that both have nonzero probabilities.
This is corroborated by the definitions of independence and mutual exclusivity. More concretely, suppose that event $A$ is obtaining Tail and event $B$ is obtaining neither Head nor Tail, so that $P(B)=0.$ Knowing that $A$ happens does not change $B$'s zero probability, so $A$ and $B$ are independent; yet $A$ and $B$ are also mutually exclusive.
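This edge case can also be checked mechanically; a minimal sketch, assuming a fair coin with sample space $\{h,t\}$ and taking $B$ to be the empty (impossible) event:

```python
from fractions import Fraction

coin = {"h", "t"}  # two equally likely outcomes

def prob(event):
    return Fraction(len(event), len(coin))

A = {"t"}    # Tail
B = set()    # neither Head nor Tail: the impossible event, P(B) = 0

# Mutually exclusive: the intersection is empty.
print(A & B == set())                    # True
# Yet independent: P(A ∩ B) = 0 = P(A) * P(B).
print(prob(A & B) == prob(A) * prob(B))  # True
```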