How does scaling $\Pr(B|A)$ with $\Pr(A)$ mean multiplying them together?


I already read this, and so wish to intuit 3 without relying on (only rearranging) the definition of Conditional Probability.
I modified the following's source for concision.

$1.$ Now look at $\Pr(A \cap B)$. We know that if $A$ has happened, then $A \cap B$ happens with probability $\Pr(B\mid A)$.

$2.$ If we do NOT know that $A$ has happened, we must $\color{darkred}{SCALE \; \Pr(B\mid A) \text{ with } \Pr(A)}$.
$3.$ Thus, $ \Pr(A \cap B)= \Pr(B\mid A)\Pr(A) \text{.}$

I pursue only intuition; please do not answer with formal proofs.

I do not understand 2. How does $\color{darkred}{SCALING \; \Pr(B\mid A) \text{ with } \Pr(A)}$ translate into multiplying them both together? For example, why does 'scaling' not imply addition?


There are 2 best solutions below


Suppose you were to grab the edges of $A$ and stretch it out so it covers all of $\Omega$. $B$ stretches out with it. Now we ask, what proportion of $\Omega$ is covered by the intersection of $A$ and $B$? The answer is simply the proportion of $A$ covered by $B$. The proportional stretching of things doesn't change the proportion of $A$ covered by $B$ so this value is still $P(B|A)$.

But $A$ isn't actually the whole universe. We have to shrink everything back down so that it returns to its original size relative to $\Omega$. To do this, we shrink everything by a factor of $P(A)$ (the true size of $A$ within $\Omega$).

Now when we ask what the size of the portion of $B$ that overlaps $A$ is, relative to the entire universe $\Omega$, the answer is $P(B|A)$ shrunk down by $P(A)$, which is $P(B|A)P(A)$.
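The stretch-then-shrink picture can be checked on a small finite universe. The sketch below is only an illustration: the fair die and the particular events $A$ (even roll) and $B$ (roll above 3) are assumptions chosen for the example and are not part of the original question.

```python
from fractions import Fraction

# Hypothetical universe and events, chosen only for illustration.
omega = set(range(1, 7))   # one roll of a fair six-sided die
A = {2, 4, 6}              # event A: the roll is even
B = {4, 5, 6}              # event B: the roll exceeds 3


def share(part, whole):
    """Fraction of `whole` that is covered by `part`."""
    return Fraction(len(part & whole), len(whole))


p_A = share(A, omega)            # size of A relative to the universe
p_B_given_A = share(B, A)        # size of B relative to A (A "stretched" to fill everything)
p_A_and_B = share(A & B, omega)  # size of the overlap relative to the universe

# Shrinking P(B|A) back down by the factor P(A) recovers the overlap.
assert p_B_given_A * p_A == p_A_and_B
print(p_B_given_A, p_A, p_A_and_B)
```

In this example, adding the two numbers instead would give $\tfrac23 + \tfrac12 = \tfrac76$, which is larger than 1 and so cannot be a proportion of $\Omega$.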

4
On

Scaling means change of scale.   That's multiplicative, not additive.

$\mathsf P(A)$ is the measure of $A$ within the universe ($\Omega$).

$\mathsf P(B\mid A)$ is the measure of $B$ within $A$.

$\mathsf P(B\cap A)$ is the measure of $B\cap A$ within the universe.

  1. So if we scale the measure of $B$ within $A$ by the measure of $A$ within the universe, we get the measure of the intersection, $B\cap A$, within the universe.

We change the scale, from $A$ to the universe.   $\mathsf P(B\mid A)~\mathsf P(A) = \mathsf P(B\cap A)$.


As an example, suppose there are five cups of equal capacity on the table.   One cup is half full: call this cup $C.$  

The amount of water in $C$ is one tenth of the capacity of all cups on the table. Denote the existence of water by $W$.   Let $V$ be the volume measure; then:   $V ( W\mid C ) =\tfrac 12, \quad V( C )=\tfrac 1 5, \quad V( C\cap W) = \tfrac 1{10}.$
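The cup numbers can be checked with exact fractions. This is a minimal sketch that assumes all five cups have the same capacity, so that $V(C)=\tfrac15$; the variable names are made up for the illustration.

```python
from fractions import Fraction

n_cups = 5                     # five cups, assumed to have equal capacity
v_C = Fraction(1, n_cups)      # V(C): cup C's share of the total capacity
v_W_given_C = Fraction(1, 2)   # V(W | C): cup C is half full

# Change of scale: from "within C" to "within all cups on the table".
v_C_and_W = v_W_given_C * v_C
print(v_C_and_W)  # 1/10
```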


[This question from the OP refers to an older version of this answer:]
2. What is $\mu$ in your last paragraph?

Measure.   It is convention to use $\mu$ as the generic symbol for "a measure function".   I suppose I should have used, and now have changed to, $V$-olume since that is more specific.

  1. I still do not understand 1. How is 1 true? – LePressentiment

Why should it not be true?   $A\cap B$ is the overlap of $B$ and $A$.   The amount of $B$ measured with respect to $A$ is then the amount of $B\cap A$ divided by the amount of $A$ itself.

There are 301 students within the school, divided as evenly as possible into 3 houses of 100 students each, except Green House, which has 101.   There are 138 female students in the school, of whom 49 are in Green House.   What proportion of students in Green House are female?

Letting $F$ represent the set of female students, $G$ the set of students in Green House, and $n(\cdot)$ the count of students in a set, we have: $$n( F\cap G) = 49, \; n(G) = 101,\text{ so the proportion is } n(F\cap G)/n(G) = 49/101$$

Alternatively, letting $p$ be the proportion of students (within the school, or given condition), we have: $$p(F\cap G) = 49/301, \; p(G) = 101/301,\text{ so then }p(F\mid G) = 49/101$$
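Both routes to $49/101$ can be reproduced with exact arithmetic. The sketch below uses only the counts stated above; the variable names are illustrative.

```python
from fractions import Fraction

n_school = 301    # students in the school
n_G = 101         # students in Green House
n_F_and_G = 49    # female students in Green House

# Proportions relative to the whole school.
p_G = Fraction(n_G, n_school)
p_F_and_G = Fraction(n_F_and_G, n_school)

# Changing scale from the school to Green House: divide by p(G).
p_F_given_G = p_F_and_G / p_G

# Same answer as counting directly inside Green House.
assert p_F_given_G == Fraction(n_F_and_G, n_G)

# Scaling back to the school: multiply by p(G).
assert p_F_given_G * p_G == p_F_and_G
print(p_F_given_G)  # 49/101
```

Dividing by $p(G)$ moves from the school's scale to Green House's scale, and multiplying by $p(G)$ moves back, which is the scaling the question asks about.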

That is all conditional probability is: a proportionate measure.

$\mathsf P(B\mid A)$ is "the proportionate measure of probability of outcomes that also result in event $B$, with respect to those resulting in event $A$."