Cannot understand the law for total conditional probability

Question

369 Views Asked by Bumbble Comm At 04 May 2026 - 2:42

$$P(A)=\sum_n P(A\cap B_n)\tag{1}$$

$$P(A)=\sum_n P(A\mid B_n)P(B_n)\tag{2}$$

I understand completely how $(2)$ follows from $(1)$, since this is just the definition of conditional probability, $$P(A\cap B_n)=P(A\mid B_n)P(B_n).$$ Now since $$P(A_m\cap B_n)=P(B_n\cap A_m),$$ I can swap the roles of the two families of events and rewrite $(2)$ as $$P(B)=\sum_m P(B\mid A_m)P(A_m)\tag{3}$$ where $n, m \in \Bbb{N}$. So the probability for event $B$ to occur, $P(B)$, can be found by looking at this tree diagram for conditional probabilities from Wikipedia.

For the case that $m=2$, using equation $(3)$, $$P(B)=P(B\mid A)P(A)+P(B\mid \bar{A})P(\bar{A})\tag{4}$$

Now here is the problem. Wikipedia then goes on to give the analogous formulas with a conditional probability on the left-hand side, but does not say where the formulas below come from:

$$P(A \mid C) = \sum_n P(A \mid C \cap B_n) P(B_n \mid C)\tag{5}$$

$$P(A \mid C) = \sum_n P(A \mid C \cap B_n) P(B_n)\tag{6}$$

where $(6)$ is the special case of $(5)$ in which each $B_n$ is independent of $C$.

In a course on foundations of quantum mechanics, I have seen how these formulae ($(5)$ and $(6)$) on Wikipedia were derived (suppressing unnecessary detail):

but I still cannot understand how formula $(1.18)$ was obtained from $(1.17)$.

I do not understand why $P(A \mid C)$ is obtained by summing $(1.17)$ over $j$: $$\sum_jP(A_i \cap B_j\mid C_k ) = \sum_j P(A_i \mid B_j \cap C_k) P(B_j \mid C_k)\tag{7}$$ To try to understand this, I put indices on all the events and assume for simplicity that $i,j,k\in \{1,2\}$. I then construct the tree diagram, which is,

Let's say I wanted to compute $P(A_2)$. Then, following the same logic as for the simpler case earlier (equation $(4)$ above), this is given by $$P(A_2)\stackrel{?}{=}P(B_1\mid C_2)P(A_2\mid B_1)+P(B_2\mid C_2)P(A_2\mid B_2)\tag{?}$$ Now, of course, I know that the LHS of $(?)$ is actually $P(A_2\mid C_2)$. What do these sums over probabilities of intersections in $(7)$, $$\sum_jP(\color{red}{A_i \cap B_j}\mid C_k )$$ correspond to in a tree diagram? Specifically, it is the part marked in red that is causing all the difficulty: I don't know how to interpret it on the tree diagram above.

Remark: I'm sorry if what I'm asking here is unclear; I am still working on a way to word it better. I have also read this related question, but the proof in one of the answers does not address the case of conditional probabilities (which is what I am asking about).

Bumbble Comm On 16 Jun 2021 - 6:54

The "law of total probability" is just the fact that if $\mu$ is a measure and $\{A_n\}_{n\in \mathbb N}$ is a sequence of pairwise disjoint $\mu$-measurable sets, then

$$ \mu\left(\bigcup_{n\in \mathbb{N}}A_n\right)=\sum_{n\in \mathbb{N}}\mu(A_n)\tag1 $$

Now, a probability function $P$ is a measure. If we have a sequence $\{B_n\}_{n\in \mathbb N}$ of pairwise disjoint events such that $\bigcup_{n\in \mathbb{N}}B_n=\Omega $, where $\Omega $ is the sample space, then $\bigcup_{n\in \mathbb{N}}(A\cap B_n)=A$ for any event $A$, and the sequence $\{A\cap B_n\}_{n\in \mathbb N}$ is also pairwise disjoint. From (1) we then get the so-called "law of total probability", that is

$$ P(A)=\sum_{n\in \mathbb{N}}P(A\cap B_n)\tag2 $$

That is all. In your question it is not clear what $A_n, A, B_n$ or $B$ are; it needs context to make sense.
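As a quick sanity check of $(2)$, here is a minimal Python sketch on a finite sample space. The fair-die setup, the event `A`, the partition, and the helper `prob` are all illustrative assumptions and do not come from the question.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Uniform probability of an event, given as a subset of omega."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}                          # "the roll is even"
partition = [{1, 2}, {3, 4}, {5, 6}]   # pairwise disjoint, union is omega

# The sets A ∩ B_n are pairwise disjoint and their union is A,
# so additivity gives P(A) = sum_n P(A ∩ B_n).
total = sum(prob(A & B) for B in partition)

# Same sum written with conditional probabilities: P(A | B_n) P(B_n).
total_cond = sum((prob(A & B) / prob(B)) * prob(B) for B in partition)

print(total, total_cond, prob(A))
```

Exact fractions are used so that the equality can be checked without floating-point tolerance.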

Bumbble Comm On 16 Jun 2021 - 7:24

If for example you were looking for $\color{green}{A_2 \cap B_1}$ it would be the green events shown below, while if you were looking for $\color{purple}{A_2 \cap B_2}$ it would be the purple events shown below.

If you wanted to consider $P(A_2 \mid C_1)= P(\color{green}{A_2 \cap B_1}\mid C_1)+P(\color{purple}{A_2 \cap B_2}\mid C_1)$, then you would concentrate on the left half of the tree, the part involving $C_1$.

Bumbble Comm On 05 Sep 2022 - 2:37

One derivation of formula $(1.18)$ is

$$P(A\mid C) = \frac{P(A,C)}{P(C)} = \frac{\sum_j P(A,B_j,C)}{P(C)} = \frac{\sum_j P(A\mid B_j,C)\,P(B_j\mid C)\,P(C)}{P(C)} = \sum_j P(A\mid B_j,C)\,P(B_j\mid C)$$
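The chain of equalities above can be checked numerically. The following is a minimal sketch, assuming an arbitrary made-up joint distribution over triples `(a, b, c)`; the weights and the helper names `P`, `cond`, `is_A`, `is_B`, `is_C`, and `both` are hypothetical and only serve the illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over outcomes (a, b, c):
# a, c in {0, 1}, b in {0, 1, 2}. The weights are arbitrary positive integers.
weights = {(a, b, c): 1 + a + 2 * b + 3 * c + a * b * c
           for a, b, c in product((0, 1), (0, 1, 2), (0, 1))}
norm = sum(weights.values())
joint = {outcome: Fraction(w, norm) for outcome, w in weights.items()}

def P(event):
    """Probability of an event, given as a predicate on (a, b, c)."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

def both(e1, e2):
    """Intersection of two events."""
    return lambda a, b, c: e1(a, b, c) and e2(a, b, c)

def cond(event, given):
    """Conditional probability P(event | given), straight from the definition."""
    return P(both(event, given)) / P(given)

def is_A(a, b, c):
    return a == 1

def is_C(a, b, c):
    return c == 1

def is_B(j):
    # The events B_0, B_1, B_2 partition the sample space.
    return lambda a, b, c: b == j

lhs = cond(is_A, is_C)
rhs = sum(cond(is_A, both(is_B(j), is_C)) * cond(is_B(j), is_C)
          for j in (0, 1, 2))
print(lhs == rhs)
```

Because every probability here is an exact fraction and every outcome has positive weight, the comparison tests the identity itself rather than a rounded approximation.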

**Bumbble Comm** · Accepted Answer

Regarding $(1.18)$: this is actually equation $(1)$ in disguise.

$$P(A \mid C) P(C) = P(A \cap C) = \sum_j P((A \cap C) \cap B_j) = \sum_j P(A \cap B_j \cap C) = \sum_j P(A \cap B_j \mid C) P(C).$$ Dividing by $P(C)$ yields the first equality in (1.18).

Your equation $(?)$ is not correct, and I am not sure what you are going for there. If you are really computing $P(A_2)$, then you can't neglect $C_1$. One way to correct this equation is $$P(A_2 \cap C_2) = P(A_2 \cap C_2 \cap B_1) +P(A_2 \cap C_2 \cap B_2),$$ which you get by looking at the $C_2$ tree. If you want to write these in terms of conditional probabilities, you can write \begin{align} P(A_2 \cap C_2) &= P(A_2 \mid C_2) P(C_2) \\ P(A_2 \cap C_2 \cap B_1) &= P(A_2 \mid B_1 \cap C_2)P(B_1 \mid C_2) P(C_2) \\ P(A_2 \cap C_2 \cap B_2) &= P(A_2 \mid B_2 \cap C_2)P(B_2 \mid C_2) P(C_2) \end{align}
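To make the difference between $P(A_2)$ and $P(A_2 \mid C_2)$ concrete, here is a small sketch of a two-level tree with $j,k\in\{1,2\}$. The branch probabilities are made-up numbers chosen for illustration, and the names `p_C`, `p_B_given_C`, `p_A2_given_BC`, and `path` are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical branch probabilities of the tree (root -> C_k -> B_j -> A_2).
p_C = {1: F(1, 2), 2: F(1, 2)}                       # P(C_k)
p_B_given_C = {(1, 1): F(1, 4), (2, 1): F(3, 4),     # P(B_j | C_k), key (j, k)
               (1, 2): F(1, 3), (2, 2): F(2, 3)}
p_A2_given_BC = {(1, 1): F(1, 5), (2, 1): F(2, 5),   # P(A_2 | B_j ∩ C_k), key (j, k)
                 (1, 2): F(1, 2), (2, 2): F(1, 4)}

def path(j, k):
    """P(A_2 ∩ B_j ∩ C_k): product of the branch probabilities along one path."""
    return p_C[k] * p_B_given_C[(j, k)] * p_A2_given_BC[(j, k)]

# Sum over the C_2 subtree only: this is P(A_2 ∩ C_2).
p_A2_and_C2 = path(1, 2) + path(2, 2)

# Dividing by P(C_2) gives P(A_2 | C_2) = sum_j P(A_2 | B_j ∩ C_2) P(B_j | C_2).
p_A2_given_C2 = p_A2_and_C2 / p_C[2]

# The unconditional P(A_2) needs both subtrees, C_1 and C_2.
p_A2 = sum(path(j, k) for j in (1, 2) for k in (1, 2))

print(p_A2_given_C2)  # 1/3
print(p_A2)           # 41/120
```

With these numbers the two quantities differ, which is why the $C_1$ branch cannot be dropped when the left-hand side is $P(A_2)$.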

Personally I don't think the trees will be too helpful in understanding conditional probabilities. (They can be helpful when considering intersections of events, since each node of the tree corresponds to the intersection of events along the path from the root of the tree to the node.) I find it simplest to reduce conditional probabilities down from the definition, e.g. $P(A \mid B) = P(A \cap B) / P(B)$; everything should follow pretty simply if you apply this definition.
