Decomposing Joint Probabilities using the Chain Rule


As a learning exercise, I am trying to write out the decomposition of a joint probability using the chain rule. I am working through the basic example given on Wikipedia.

This example is the following:

$P(A_4\,\cap\,A_3\,\cap\,A_2\,\cap A_1)$

How exactly do we use the chain rule to decompose this? Perhaps someone could show the first step or two so I could get the basic intuition for how to proceed? I am more familiar with the chain rule for functions in calculus, which is very clear (take the derivative of the outer function, leaving the inner function unchanged, then multiply by the derivative of the inner function). I am not sure how to apply that same logic here.

Thanks.

Best answer

Although the term "chain rule" is sometimes used in probability, it is not the same chain rule as you learned in calculus, so that might be part of your confusion. I prefer to think of the "chain rule" in probability as applying the definition of conditional probability.

The definition of the conditional probability of $A$ given $B$ (assuming $P(B) > 0$) is:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

Rearranging this yields:

$$P(A \cap B) = P(A\mid B)P(B).$$

This tells us how to decompose $P(A \cap B)$ as a product of probabilities. But you are given the probability of an intersection of four events, not two. However, treating $A_4$ as $A$ and $A_3 \cap A_2 \cap A_1$ as $B$ in the formula above, you can write:

$$P(A_4 \cap A_3 \cap A_2 \cap A_1) = P(A_4 \mid A_3 \cap A_2 \cap A_1) P(A_3 \cap A_2 \cap A_1).$$

Now we can treat $A_3$ as $A$ and $A_2 \cap A_1$ as $B$ in the formula to write:

$$P(A_3 \cap A_2 \cap A_1)=P(A_3 \mid A_2 \cap A_1)P(A_2 \cap A_1).$$

Finally, apply the formula one more time to $P(A_2 \cap A_1)$ to get:

$$P(A_2 \cap A_1) = P(A_2 \mid A_1) P(A_1).$$

Putting this all together,

\begin{align*} P(A_4 \cap A_3 \cap A_2 \cap A_1) &= P(A_4 \mid A_3 \cap A_2 \cap A_1) P(A_3 \mid A_2 \cap A_1)P(A_2 \mid A_1)P(A_1). \end{align*}

Note that we could have chosen to "peel off" the $A_i$'s in any order we want, as long as every event we condition on has positive probability. So we could just as well say:

\begin{align*} P(A_4 \cap A_3 \cap A_2 \cap A_1) &= P(A_2 \mid A_3 \cap A_4 \cap A_1) P(A_3 \mid A_4 \cap A_1)P(A_4 \mid A_1)P(A_1). \end{align*}
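If it helps to see the decomposition with actual numbers, here is a minimal Python sketch that checks both orderings on a small example. The sample space (two fair dice) and the four events are my own illustrative choices, not part of the Wikipedia example, and the helper names `prob` and `cond` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: all 36 equally likely outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a set of outcomes) under the uniform measure."""
    return Fraction(len(event), len(omega))

def cond(a, b):
    """Conditional probability P(a | b) = P(a and b) / P(b); requires P(b) > 0."""
    return prob(a & b) / prob(b)

# Four illustrative events (arbitrary choices for this sketch).
A1 = {w for w in omega if w[0] % 2 == 0}        # first die is even
A2 = {w for w in omega if w[0] + w[1] >= 6}     # sum is at least 6
A3 = {w for w in omega if w[1] > 2}             # second die is greater than 2
A4 = {w for w in omega if (w[0] + w[1]) % 2 == 0}  # sum is even

# Left-hand side: the joint probability computed directly.
joint = prob(A4 & A3 & A2 & A1)

# Peel off A4, then A3, then A2, leaving P(A1).
chain = cond(A4, A3 & A2 & A1) * cond(A3, A2 & A1) * cond(A2, A1) * prob(A1)

# A different peeling order: A2, then A3, then A4, leaving P(A1).
other = cond(A2, A3 & A4 & A1) * cond(A3, A4 & A1) * cond(A4, A1) * prob(A1)

print(joint, chain, other)
```

Using `Fraction` keeps the arithmetic exact, so the three values can be compared with `==` without worrying about floating-point rounding.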