Are there two different notions of "conditional probability"?


This question comes from reading the discussion here.

(1) Suppose one is given a "probability measure" $P : F \rightarrow [0,1]$ on a Borel $\sigma$-algebra $F$ over an underlying space of "outcomes" $O$, together with two "random variables" $X: O \rightarrow S_1$ and $Y: O \rightarrow S_2$ mapping $O$ to some sets $S_1$ and $S_2$ respectively. Then one can define the "conditional probability" as a quantity between the two random variables, namely the map

$$P(X\mid Y) : S_1 \times S_2 \rightarrow [0,1]$$ $$(s_1,s_2) \mapsto \frac{ P ( X^{-1}(s_1) \cap Y^{-1}(s_2)) }{ P( Y^{-1}(s_2))}$$

(2) But if $X$ and $Y$ were two "events", i.e. $X, Y \in F$, then it is equally possible to define a conditional probability by the "Kolmogorov definition" $$P (X \mid Y ) = \frac { P(X \cap Y) }{ P(Y) } $$

  • Are these two different notions of conditional probability?


There are two notions of conditional expectation; however, in your question, (1) and (2) are of the same kind.

Note that in (1) you defined the probability of the event $A = X^{-1}(s_1)$ given the event $B = Y^{-1}(s_2)$.

In (2) you did the same for the events $A = X$ and $B = Y$.
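To make this concrete, here is a small finite sanity check (a hypothetical coin-flip example, not from the question) showing that definition (1), applied to the preimage events, is exactly the Kolmogorov formula (2):

```python
from fractions import Fraction

# Hypothetical toy probability space: two independent fair coin flips.
omega = [("H", "H"), ("H", "T"), ("T", "H"), ("T", "T")]
P = {w: Fraction(1, 4) for w in omega}

# Random variables X, Y : O -> S mapping outcomes to values.
X = lambda w: w[0]  # first flip
Y = lambda w: w[1]  # second flip

def preimage(f, s):
    """The event f^{-1}(s) as a set of outcomes."""
    return {w for w in omega if f(w) == s}

def prob(event):
    return sum(P[w] for w in event)

# Definition (1): P(X = s1 | Y = s2) via preimages of the random variables.
def cond_rv(s1, s2):
    return prob(preimage(X, s1) & preimage(Y, s2)) / prob(preimage(Y, s2))

# Definition (2): Kolmogorov's P(A | B) for events A, B.
def cond_event(A, B):
    return prob(A & B) / prob(B)

# The two agree once (1) is read as conditioning the event
# X^{-1}(s1) on the event Y^{-1}(s2).
assert cond_rv("H", "T") == cond_event(preimage(X, "H"), preimage(Y, "T"))
```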

You can consider the conditional probability of events as you did in (1) or (2) only if $P(Y) \neq 0$. What will you do if $P(Y) = 0$?

The way to a more general notion of conditional expectation is not immediate and requires several techniques. Conditional expectation given a $\sigma$-algebra is defined as a random variable and requires Radon–Nikodym derivatives.
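In the finite case the $\sigma$-algebra picture can be computed by hand, since the Radon–Nikodym derivative is just a ratio on each atom. A minimal sketch (a hypothetical die-roll example; the names are mine), conditioning on the $\sigma$-algebra generated by a partition:

```python
from fractions import Fraction

# Hypothetical finite example: one roll of a fair die.
omega = set(range(1, 7))
P = {w: Fraction(1, 6) for w in omega}

def prob(event):
    return sum(P[w] for w in event)

# Sigma-algebra generated by a finite partition of omega:
# here, the "even/odd" information.
partition = [{2, 4, 6}, {1, 3, 5}]

A = {1, 2}  # the event to condition

# P(A | F) is the F-measurable random variable that is constant on
# each partition block B, with value P(A ∩ B)/P(B) there -- in the
# finite case this is the Radon-Nikodym derivative of
# Q(B) = P(A ∩ B) with respect to P restricted to F.
def cond_prob_given_partition(A, partition):
    rv = {}
    for B in partition:
        value = prob(A & B) / prob(B)
        for w in B:
            rv[w] = value
    return rv

rv = cond_prob_given_partition(A, partition)

# Defining property: integrating P(A | F) over each block B in F
# recovers P(A ∩ B).
for B in partition:
    assert sum(rv[w] * P[w] for w in B) == prob(A & B)
```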

The connection between these two notions is again not immediate; it requires the principle of substitution and the notion of kernels (regular conditional probability distributions).


The first definition is not right. If $X$ is a random variable, then conventionally one does not define $P(X)$. One may define $P(X\in A)$, where $A$ is some measurable set in the codomain of $X$. This is the probability of an event, not of a random variable. If $X$ is a real-valued random variable, then $A$ would be a measurable set of real numbers.

If $Y$ is a random variable and $A$ is an event, then one can define $P(A\mid Y)$ as a random variable that is a function of $Y$. If $Y$ is a discrete random variable, then $P(A\mid Y=y)$ is just an ordinary conditional probability given by your second definition. It is a function of $y$, so one can write $P(A\mid Y=y) = g(y)$ for some function $g$. In that case, one defines $P(A\mid Y)$ to be $g(Y)$. That is a random variable. That its expected value is $P(A)$ is one of the meanings sometimes given to the term "law of total probability".

One can also write such a definition when $Y$ is a continuous random variable, and in fact one can define $P(A\mid\mathcal F)$, where $\mathcal F$ is a sigma-algebra of subsets of the probability space, to be an $\mathcal F$-measurable random variable whose integral over each $\mathcal F$-measurable set is the same as the probability of the intersection of that set with $A$.
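The discrete construction above can be sketched in a few lines (a hypothetical example of my own: a fair die with $Y$ its parity), checking the law-of-total-probability identity $E[P(A\mid Y)] = P(A)$:

```python
from fractions import Fraction

# Hypothetical discrete example: a fair die, conditioned on parity.
omega = set(range(1, 7))
P = {w: Fraction(1, 6) for w in omega}

Y = lambda w: w % 2   # a discrete random variable (parity of the roll)
A = {1, 2, 3}         # an event

def prob(event):
    return sum(P[w] for w in event)

# g(y) = P(A | Y = y): an ordinary conditional probability,
# computed by the Kolmogorov formula.
def g(y):
    B = {w for w in omega if Y(w) == y}
    return prob(A & B) / prob(B)

# P(A | Y) is then the random variable g(Y): a function on omega
# that depends on the outcome only through Y.
cond = {w: g(Y(w)) for w in omega}

# Law of total probability: E[P(A | Y)] = P(A).
assert sum(cond[w] * P[w] for w in omega) == prob(A)
```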

I haven't read the other question, but I notice that the name Rényi was mentioned, and that alone makes me wonder whether some deliberately non-standard definitions were contemplated.