Why is the conditional probability formula not the intersection of A and B over B, rather than the intersection over P(B)?


The conditional probability of $A$ given the occurrence of $B$ is how likely $A$ is to have occurred given that $B$ has occurred. The formal definition of the conditional probability of $A$ given $B$ is $$P(A|B)=\frac{P(A\cap B)}{P(B)}$$

I get the numerator term: when we reduce the sample space to just occurrences of $B$, then the occurrences of $A$ will be all and only those occurrences of $A$ which are also occurrences of $B$ - i.e. the intersection of $A$ and $B$.

I don't get the denominator term, though. If we know that $B$ has occurred, then $B$ becomes the sample space. The probability of an event is defined to be the long-run relative frequency of favourable outcomes to all possible outcomes (i.e. the sample space). That is: $$P(A) = \frac{A}{\Omega}$$

But if $B$ becomes the sample space, then why are we assigning a probability to $B$, as opposed to dividing by the number of occurrences of $B$? In other words, should we not have: $$P(A|B)=\frac{A\cap B}{B}$$

The probability of $B$ here is just $1$, is it not? What have I missed?


There are 3 best solutions below

0
On

In the approach you are thinking of, where you are "counting the number of favourable outcomes", it should really be the cardinality or size of the set: i.e. $\mathbb{P}(A) = \frac{|A|}{|\Omega|}$ (where these quantities all make sense). Now you suggest the conditional probability should be of the form $\mathbb{P}(A \mid B) = \frac{|A \cap B|}{|B|}$; compare this to the more general definition $\mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)}$. These are essentially the same thing when our probabilities are defined by counting, i.e. $\mathbb{P}(A \cap B) = \frac{|A \cap B|}{|\Omega|}$ and $\mathbb{P}(B) = \frac{|B|}{|\Omega|}$ (sub them in!).
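Carrying out that substitution explicitly, assuming a finite sample space $\Omega$ whose outcomes are equally likely and $|B| > 0$, the factors of $|\Omega|$ cancel:

$$\mathbb{P}(A \mid B) = \frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)} = \frac{|A \cap B| / |\Omega|}{|B| / |\Omega|} = \frac{|A \cap B|}{|B|}$$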

0
On

Recall that $P(A|B)P(B) = P(B|A)P(A) = P(A \cap B)$.

In other words, the probability of $A$ is conditional upon $B$ occurring, so the probability of $B$ must be accounted for in the expression. Really, you can think of the conditional probability formula (and Bayes' rule, which follows from it) as simple algebra on this identity.

$P(A|B)P(B) = P(A \cap B) \rightarrow\ P(A|B) = \frac{P(A \cap B)}{P(B)}$

If A and B are independent, in contrast, one could simply write $P(A)P(B) = P(A \cap B)$, because in this case $P(A|B) = P(A)$.

The remark about dividing by the number of occurrences of $B$ sounds like perhaps you were thinking of the relative frequency among equally probable outcomes, whereas the definition above applies more generally.
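To illustrate the difference, here is a minimal sketch using a hypothetical loaded die (the weights are an assumption chosen purely for illustration, not something from the question). When outcomes are not equally probable, counting elements of $B$ no longer agrees with dividing by $P(B)$:

```python
from fractions import Fraction

# Hypothetical loaded die (an assumption for illustration):
# face 6 comes up half the time, every other face 1/10 of the time.
weights = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
           4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}

def prob(event):
    """Probability of an event (a set of faces) under the loaded die."""
    return sum(weights[face] for face in event)

A = {6}          # roll a six
B = {2, 4, 6}    # roll an even number

general = prob(A & B) / prob(B)          # P(A ∩ B) / P(B)
counting = Fraction(len(A & B), len(B))  # |A ∩ B| / |B|

print(general)   # 5/7
print(counting)  # 1/3
```

The two values coincide only when every face carries the same weight, which is the equally-probable case the counting formula silently assumes.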

4
On

> The probability of B here is just 1, is it not?

Why would you think that? $B$ is a subset of the outcome space, so it has some probability between $0$ and $1$ (excluding $0$, because we don't want to divide by that). $$\mathsf P(B)\in(0,1]$$

Note: $\mathsf P(B)$ is not the probability of $B$ when given $B$. Rather it is the probability of $B$ when given no constraints.

If you like: $\mathsf P(B)=\mathsf P(B\mid\Omega)$ and so forth, so basically:

$$\mathsf P(A\mid B)=\dfrac{\mathsf P(A\cap B\mid\Omega)}{\mathsf P(B\mid\Omega)}$$

The probability of $A$ when given $B$ equals the probability of the intersection of $A$ and $B$ (given no further constraints), divided by the probability of $B$ (given no further constraints).
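As a concrete check, here is a small sketch that enumerates two fair dice. The events $A$ (the sum is 8) and $B$ (the first die is even) and the helper `prob` are illustrative choices, not part of the question. It shows that the unconditional $\mathsf P(B)$ is not $1$, even though $\mathsf P(B\mid B)$ is:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
omega = set(product(range(1, 7), repeat=2))

def prob(event, given=omega):
    """P(event | given), computed by counting equally likely outcomes."""
    return Fraction(len(event & given), len(given))

A = {o for o in omega if sum(o) == 8}      # the two dice sum to 8
B = {o for o in omega if o[0] % 2 == 0}    # the first die is even

print(prob(B))                  # 1/2  (B given no constraints)
print(prob(B, given=B))         # 1    (B given B)
print(prob(A, given=B))         # 1/6  (counting inside B)
print(prob(A & B) / prob(B))    # 1/6  (the general formula)
```

The last two lines agree because the outcomes are equally likely, so restricting the count to $B$ and dividing by $\mathsf P(B)$ are the same operation here.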