What does multiplication mean in probability theory?


For independent events, the probability of both occurring is the product of the probabilities of the individual events:

$Pr(A\; \text{and}\;B) = Pr(A \cap B)= Pr(A)\times Pr(B)$.

Example: if you flip a coin twice, the probability of heads both times is: $1/2 \times 1/2 =1/4.$

I don't understand why we multiply. I mean, I've memorized the operation by now, that we multiply for independent events; but why, I don't get it.

If I have $4$ bags with $3$ balls, then I have $3\times 4=12$ balls: This I understand. Multiplication is (the act of) scaling.

But what does scaling have to do with independent events? I don't understand why we scale one event by the other to calculate $Pr(A \cap B)$, if A, B are independent.

Explain it to me as if I'm really dense, because I am. Thanks.

7

There are 7 best solutions below

4
On BEST ANSWER

I like this answer, taken from Link:

" It may be clearer to you if you think of probability as the fraction of the time that something will happen. If event A happens 1/2 of the time, and event B happens 1/3 of the time, and events A and B are independent, then event B will happen 1/3 of the times that event A happens, right? And to find 1/3 of 1/2, we multiply. The probability that events A and B both happen is 1/6.

Note also that adding two probabilities will give a larger number than either of them; but the probability that two events BOTH happen can't be greater than either of the individual events. So it would make no sense to add probabilities in this situation. "
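The "fraction of the time" reading above can be checked with exact arithmetic. This is only a minimal sketch: the trial count of 600 and the helper name `joint_count` are assumptions picked for illustration (600 divides evenly by both 2 and 3), not something from the quoted answer.

```python
from fractions import Fraction

def joint_count(trials, p_a, p_b):
    """Expected counts when A and B are independent (hypothetical helper)."""
    times_a = trials * p_a        # A happens in this many of the trials
    times_both = times_a * p_b    # B happens in this fraction of *those* trials
    return times_a, times_both

trials = 600  # assumed number of repetitions, chosen so everything divides evenly
times_a, times_both = joint_count(trials, Fraction(1, 2), Fraction(1, 3))

# Taking 1/3 of 1/2 of the trials is the same as taking 1/6 of all trials.
print(times_a, times_both, times_both / trials)  # 300 100 1/6
```

Taking "a third of a half" is exactly what the product $1/2 \times 1/3$ computes.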

0
On

Very informally, suppose that we flip two coins, say a dime and a quarter, simultaneously $10000$ times. Then the number of times we get a head on the dime should be in the $5000$ range. If there is no "interaction" between the result on the dime and the result on the quarter, then the approximate number of cases among these $5000$ in which we also get a head on the quarter is obtained by scaling $5000$ by a factor of $\frac{1}{2}$. That gives about $2500$ cases, or roughly $\frac{1}{4}$ of all $10000$ flips.
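The dime-and-quarter experiment is easy to try on a computer. Here is a rough simulation sketch; the function name `simulate` and the fixed seed are my own assumptions, and the counts will only be close to the ideal values, not equal to them.

```python
import random

def simulate(flips, seed=0):
    """Flip a dime and a quarter together `flips` times (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    dime_heads = 0
    both_heads = 0
    for _ in range(flips):
        dime = rng.random() < 0.5      # True means heads on the dime
        quarter = rng.random() < 0.5   # flipped independently of the dime
        if dime:
            dime_heads += 1
            if quarter:
                both_heads += 1
    return dime_heads, both_heads

dime_heads, both_heads = simulate(10_000)

# About half of all flips show heads on the dime, and about half of *those*
# also show heads on the quarter, so about a quarter of all flips show both.
print(dime_heads, both_heads, both_heads / dime_heads, both_heads / 10_000)
```

The third printed number should be near $\frac{1}{2}$ and the fourth near $\frac{1}{4}$.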

Remark: Here is a fancier but less intuitive version. Let $A$ be the event that the dime shows a head and $B$ the event that the quarter shows a head. Let random variable $X$ be $1$ if the event $A$ occurs, and let $X=0$ otherwise. Define random variable $Y$ analogously for $B$. So our mean income if we get a dollar for each head on a dime is $\frac{1}{2}$, as is our mean income if we get a dollar for each head on a quarter. Now assume that the events $A$ and $B$ are independent, and we get a dollar only if both dime and quarter show a head. Then our average income from dimes alone gets scaled by a factor of $\frac{1}{2}$, giving $\frac{1}{4}$.

1
On

If you randomly pick one from $n$ objects, each object has the probability $\frac{1}{n}$ of being picked. Now imagine you pick randomly twice - one object from a set of $n$ objects, and a second object from a different set of $m$ objects. There are $n\cdot m$ possible pairs of objects, all equally likely, and thus the probability of each individual pair is $\frac{1}{n\cdot m} = \frac{1}{n}\cdot \frac{1}{m}$.
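To make the counting argument concrete, here is a small sketch that lists every pair. The two sets (three letters and four numbers) are made up for illustration, so $n = 3$ and $m = 4$ here.

```python
from fractions import Fraction
from itertools import product

first_set = ["a", "b", "c"]   # n = 3 hypothetical objects
second_set = [1, 2, 3, 4]     # m = 4 hypothetical objects

# Every way of picking one object from each set.
pairs = list(product(first_set, second_set))

# All pairs are equally likely, so each has probability 1 / (number of pairs).
p_pair = Fraction(1, len(pairs))

print(len(pairs), p_pair)                               # 12 1/12
print(Fraction(1, len(first_set)) * Fraction(1, len(second_set)))  # 1/12
```

The number of pairs multiplies, so the probability of a single pair is the product of the two individual probabilities.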


More generally, let $A$ be some event with probability $\Pr(A) = a$, and $B$ some other event with probability $\Pr(B) = b$. Assume you already know that $A$ happened, meaning that instead of looking at the whole probability space (i.e. at the whole set of possible outcomes), we're now looking at only $A$. What can we say about the probability that $B$ happens also, i.e. about the probability $\Pr(B\mid A)$ (to be read as "the probability of $B$ under the condition $A$")?

In general, not much! But, if $A$ and $B$ are independent, then by the definition of independence, knowing that $A$ has happened doesn't provide us with any information about $B$. In other words, knowing that $A$ has happened doesn't make the likelihood of $B$ happening also any smaller or larger, so $$ \Pr(B\mid A) = \Pr(B) \text{ if $A,B$ are independent.} $$

Now look at $\Pr(A \cap B)$, i.e. the probability that both $A$ and $B$ happen. We know that if $A$ has happened, then $A \cap B$ happens with probability $\Pr(B\mid A)$. If we don't know that $A$ has happened, we have to scale this probability with the probability of $A$. Thus, $$ \Pr(A \cap B)= \Pr(B\mid A)\Pr(A) \text{.} $$ [ You can imagine $A$ and $B$ to be some shapes, both inside some larger shape $\Omega$. $\Pr(A\cap B)$ is then the percentage of the area of $\Omega$ that is covered by both $A$ and $B$, $\Pr(A)$ the percentage of the area of $\Omega$ covered by $A$, and $\Pr(B\mid A)$ is the percentage of the area of $A$ covered by $B$. ]

If $A,B$ are independent, we can combine these two results to get $$ \Pr(A\cap B) = \Pr(A)\Pr(B) \text{.} $$
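The steps above can be checked by brute-force counting on a small probability space. The sketch below assumes two fair dice and three events of my own choosing (`a`, `b`, `c`); the helper `pr` is hypothetical and simply counts outcomes.

```python
from fractions import Fraction
from itertools import product

# Omega: all 36 equally likely outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

def pr(event, given=None):
    """Probability of `event`, optionally restricted to outcomes where `given` holds."""
    space = omega if given is None else [w for w in omega if given(w)]
    return Fraction(sum(1 for w in space if event(w)), len(space))

def a(w):  # first die is even
    return w[0] % 2 == 0

def b(w):  # second die shows 5 or 6 (independent of a)
    return w[1] >= 5

def c(w):  # sum is at least 10 (NOT independent of a)
    return w[0] + w[1] >= 10

def a_and_b(w):
    return a(w) and b(w)

def a_and_c(w):
    return a(w) and c(w)

# Independent case: Pr(B|A) = Pr(B), so Pr(A and B) = Pr(A) Pr(B).
print(pr(b, given=a), pr(b))                 # 1/3 1/3
print(pr(a_and_b), pr(a) * pr(b))            # 1/6 1/6

# Dependent case: Pr(A and C) = Pr(C|A) Pr(A) still holds,
# but it no longer equals Pr(A) Pr(C).
print(pr(a_and_c), pr(c, given=a) * pr(a))   # 1/9 1/9
print(pr(a) * pr(c))                         # 1/12
```

So the scaling by $\Pr(A)$ always works; independence is only what lets you replace $\Pr(B\mid A)$ with $\Pr(B)$.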

3
On

You are still scaling, but by numbers that are smaller than $1$. In your example, you are scaling $1/2$ by a factor of $1/2$, scaling it down to $1/4$. The first $1/2$ represents the outcomes where the first coin flipped is heads. But only $1/2$ (the second "$1/2$" from your example) of those outcomes also have the second coin come up heads.

1
On

Your question made me question myself!

And then, after thinking for a while, I realized why you and I didn't understand the WHY in the first place.

You don't understand because it is not a MULTIPLICATION. It is in fact a DIVISION. You are multiplying two fractions, so it is by nature a division.

Multiplication is just to facilitate mathematics notation.

0
On

There are different ways to understand why multiplication is used; one stems from the origin of the multiplication rule, which explains why multiplication is used and where it comes from. I am afraid the "meaning in itself" of the multiplication isn't going to be illuminating, because the multiplication here is derived from a definition (and definitions are assigned and somewhat arbitrary, except that a convention is established). We could have said that the probability of a certain event is π and that of one that will never occur is -π (and then adjusted the mathematics accordingly).

What you are referring to is the multiplication rule of probability. This rule stems from the definition of the probability of an event in basic probability. Namely, the probability that an event occurs is equal to the number of ways that it could possibly occur divided by the total number of (equally likely) outcomes. Keep this in mind, because this simple idea is used to derive the multiplication rule of probability.

Probability of an event = Number of ways it can happen / total number of outcomes

Suppose we are given two events (in this case event B happens after event A, and depends on the outcome of A... removing different colored marbles from a bag without replacement, for example).

The probability that event B happens given event A happened = the number of ways that B can happen when A happens / total number of ways that A happens.

Typically a Venn diagram is used to illustrate this (draw intersecting circles, label one of them A, the other one B, and the overlapping region in the middle the "intersection of A and B", where intersection is denoted by the symbol ∩, as in "A ∩ B").

We can restate this in more formal terms as follows:

Pr(B∣A) = Pr(A ∩ B) / Pr(A)

This is analogous to the definition of the probability of an event. Let me show you them side by side.

Pr(B∣A) = Pr(A ∩ B) / Pr(A)

Probability of an event = Number of ways it can happen / total number of outcomes

The probability of the event corresponds to Pr(B|A)

The number of ways it can happen corresponds to Pr(A ∩ B)

The total number of outcomes corresponds to Pr(A)

This is simply an extension of the definition. Multiply both sides of this equation by Pr(A) and we arrive at a new equation:

Pr(A ∩ B) = Pr(A)Pr(B|A) ... now you are wondering what is the meaning of "A ∩ B" in this context if it is supposed to refer to "Number of ways it can happen"... so look at the addition rule of probability, as follows:

P(A or B) = P(A) + P(B) - P(A ∩ B) ...Again this is more intuitive with Venn diagrams.

Rearranging the equation: P(A ∩ B) = P(A) + P(B) - P(A or B)
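Both rules can be checked by plain counting on the marble example. This is just a sketch under assumed numbers: the bag of 3 red and 2 blue marbles, and the names `first_red`, `second_red` and `pr`, are mine, not part of the derivation above.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical bag: 3 red and 2 blue marbles, two draws without replacement.
bag = ["R1", "R2", "R3", "B1", "B2"]
draws = list(permutations(bag, 2))  # 20 equally likely ordered draws

def pr(event):
    """Number of ways the event can happen / total number of outcomes."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))

def first_red(d):   # event A
    return d[0].startswith("R")

def second_red(d):  # event B
    return d[1].startswith("R")

p_a = pr(first_red)
p_b = pr(second_red)
p_a_and_b = pr(lambda d: first_red(d) and second_red(d))
p_a_or_b = pr(lambda d: first_red(d) or second_red(d))

# Pr(B|A): ways both happen / ways A happens.
ways_a = sum(1 for d in draws if first_red(d))
ways_both = sum(1 for d in draws if first_red(d) and second_red(d))
p_b_given_a = Fraction(ways_both, ways_a)

print(p_a, p_b_given_a, p_a_and_b)   # 3/5 1/2 3/10
print(p_a * p_b_given_a)             # 3/10, the multiplication rule
print(p_a + p_b - p_a_or_b)          # 3/10, the rearranged addition rule
```

Here Pr(B|A) = 1/2 differs from Pr(B) = 3/5, so the events are dependent, yet Pr(A ∩ B) = Pr(A)Pr(B|A) still holds.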

2
On

A visualization attempt based on @littleO's comment, which helped me to understand the probability multiplication rule. [image not available]