I often see people write $$P(X \in A) = P(X \in A | Y \in B)P(Y \in B) + P(X \in A | Y \in B^c)P(Y \in B^c)$$
I would like to justify this formally in a measure-theoretic setting.
We can write $$P(X \in A) = \int 1_A(X) dP = \int 1_A(X) \left[ 1_B(Y) + 1_{B^c}(Y)\right] dP $$$$= \int 1_A(X)1_B(Y) dP + \int 1_A(X) 1_{B^c}(Y)dP$$ $$ = P(X \in A, Y \in B) + P(X \in A, Y \in B^c)$$
And so we would be done if only I could prove that $$P(X \in A, Y \in B) = P(X \in A | Y \in B) P(Y \in B).$$
This is true by definition in basic probability theory. Is it also true in measure theory?
So I am looking for two things:
- What is the definition of conditional probability in measure theory? Personally, I was only ever introduced to conditional expectation, never to a conditional probability.
- How does one prove that this abstract definition is identical to the equation above?
You're overthinking things. Just as in elementary probability theory, in measure theory the definition of $P(A \mid B)$ for an event $B$ with $P(B) > 0$ is $$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$ Conditional expectation (and, more generally, regular conditional probability) is needed only when you want to condition on an event of probability zero or on an entire $\sigma$-algebra; for a single event of positive probability, the elementary definition *is* the measure-theoretic definition. As long as $P(Y \in B) > 0$ and $P(Y \in B^c) > 0$, your equation (the law of total probability) is entirely justified in measure theory, since it falls out algebraically once you plug in this definition.
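To spell out the algebra, plug the definition into the right-hand side (assuming $P(Y \in B) > 0$): $$P(X \in A \mid Y \in B)\,P(Y \in B) = \frac{P(X \in A,\ Y \in B)}{P(Y \in B)}\,P(Y \in B) = P(X \in A,\ Y \in B),$$ and similarly with $B$ replaced by $B^c$. This is exactly the missing step in your derivation.

To connect this with the conditional expectation you were taught: let $\mathcal{G} = \sigma(\{Y \in B\}) = \{\emptyset,\ \{Y \in B\},\ \{Y \in B^c\},\ \Omega\}$ be the finite $\sigma$-algebra generated by the event $\{Y \in B\}$. When $0 < P(Y \in B) < 1$, the conditional expectation of $1_A(X)$ given $\mathcal{G}$ is the simple function $$E[1_A(X) \mid \mathcal{G}] = P(X \in A \mid Y \in B)\,1_B(Y) + P(X \in A \mid Y \in B^c)\,1_{B^c}(Y),$$ as you can verify directly from the defining property of conditional expectation on each atom of $\mathcal{G}$. Taking expectations of both sides and using the tower property $E\big[E[1_A(X) \mid \mathcal{G}]\big] = E[1_A(X)] = P(X \in A)$ recovers your total probability formula. So the elementary identity and the conditional-expectation formulation agree wherever both make sense.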