Is my informal understanding of probability definitions correct?


I am struggling a little bit with probability and I was hoping someone could clear things up in a slightly informal way. I am new to probability, so I am in an awkward position where I need to learn how to solve problems and have an informal understanding of why things work, without getting into the nitty-gritty. I was hoping someone could point out my misunderstandings and answer a question at the end.

Set up:

There is some probability measure $\mu$ on the sample space $\Sigma$, together with a sigma-algebra which I will not talk about. A random variable is a function $X:\Sigma \to \mathbb{R}$. We can define the probability of $X \in A$ as $P(X \in A):=\mu(X^{-1}(A))$. Of course we never actually have access to $\mu$, so we work with the next best thing, which is pdfs and cdfs. By Radon–Nikodym (assuming the law of $X$ is absolutely continuous with respect to Lebesgue measure), there is a function $f$ such that for any $A$, $$\int_A f(x)\,dx=\mu(X^{-1}(A)),$$ and we call $f$ the pdf of $X$; it really just describes $\mu$ "w.r.t." $X$. To my understanding, we usually just assume the distribution of $X$, and hence the pdf.
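As a concrete sanity check (my own toy example, not from any text): take $X$ exponential with rate $\lambda$, so $f(x)=\lambda e^{-\lambda x}$ on $[0,\infty)$, and the definition gives $P(X\in[0,1])=\int_0^1 f(x)\,dx = 1-e^{-\lambda}$. A quick numerical sketch in Python confirms that integrating the pdf really does recover this probability:

```python
import math

# pdf of an Exponential(lam) random variable -- an assumed toy example
lam = 2.0
def f(x):
    return lam * math.exp(-lam * x)

# P(X in [0,1]) = integral of the pdf over [0,1], via a midpoint Riemann sum
n = 100_000
a, b = 0.0, 1.0
h = (b - a) / n
integral = sum(f(a + (i + 0.5) * h) for i in range(n)) * h

closed_form = 1 - math.exp(-lam)  # exact value of the same integral
print(integral, closed_form)      # the two numbers agree closely
```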

Now, what if we have two random variables $X,Y$? We can consider $(X,Y):\Sigma \to \mathbb{R}^2$ and analogously define probabilities of these variables interacting: $$P(X<a,Y<b):=\mu\big((X,Y)^{-1}((-\infty,a)\times(-\infty,b))\big)$$ This makes sense since the preimage still lives in $\Sigma$, and we are essentially checking, w.r.t. the measure $\mu$, what "proportion" of the sample space satisfies $X<a,Y<b$. Again, by Radon–Nikodym there is some function such that $$P(X \in A,Y\in B)=\int_{A\times B}f(x,y)\,dx\,dy,$$ where we call $f$ the joint density of $X,Y$.

Here I do have a question: is it obvious that this definition is consistent with the usual definition of a random variable's distribution?

For instance, consider the random variable $Z=X+Y$. It has some distribution, so $P(Z>0)=\int_{0}^{\infty}f_Z(z)\,dz$. Is it obvious that this is equal to $\int_{x+y>0}f_{X,Y}(x,y)\,dx\,dy$? It just seems to me, while reading my textbook, that these "obviously" give the same answer, but that is not clear to me. Is the proof "involved" or actually almost definitional?
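To make my question concrete, here is a quick simulation sketch (my own example, assuming $X$ and $Y$ are independent standard normals, so that $Z=X+Y$ is $N(0,2)$ and both sides should equal $1/2$ by symmetry). One estimate samples $Z$ directly from its own law $f_Z$; the other samples the pair from the joint law $f_{X,Y}$ and checks the event $\{x+y>0\}$ in the plane:

```python
import random

random.seed(0)
N = 200_000

# Left-hand side: Z = X + Y has the N(0, 2) law, so sample Z directly
# from its own distribution f_Z and count the event {Z > 0}.
hits_z = sum(random.gauss(0.0, 2 ** 0.5) > 0 for _ in range(N))

# Right-hand side: sample the pair (X, Y) from the joint law f_{X,Y}
# and count the event {x + y > 0} in the plane.
hits_xy = sum(random.gauss(0.0, 1.0) + random.gauss(0.0, 1.0) > 0
              for _ in range(N))

print(hits_z / N, hits_xy / N)  # both estimates are close to 1/2
```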

Now, on to the conditional random variable, say $X \mid Y=k$. Here is where things get weird. Intuitively I want to say $$P(X<a\mid Y=k)=\frac{\mu\big(X^{-1}((-\infty,a))\cap Y^{-1}(\{k\})\big)}{\mu\big(Y^{-1}(\{k\})\big)},$$ but the denominator will have measure $0$ in many cases. So this is not the way to view it?

Thank you. I hope my ramblings are decipherable, and I would greatly appreciate any comments.

On BEST ANSWER

For your first question: I think it's mostly definitional, with the only likely confusion being between integrating against the product measure on $\mathbb R^2$ and integrating along each variable separately.

For your $(X,Y)$ random variable, we've defined a measure on $\mathbb R^2$, namely the pushforward $\mu \circ (X,Y)^{-1}$, which applies to all Borel subsets of $\mathbb R^2$ (generated from intersections and unions of rectangles), even if we often apply it to rectangles $A \times B$ as in your section on joint distributions. So we really have $$ \mathbb P((X,Y) \in A) = \int_A f_{X,Y}(x,y) \, d(x,y), $$ to which we can then apply Fubini–Tonelli (the integrand is nonnegative, so the hypotheses hold) to split into integrals with respect to $x$ and $y$ separately, which is how we usually do our integration. In this case, $$ \mathbb P(X+Y>0) = \int_{-\infty}^{+\infty} \int_{-y}^\infty f_{X,Y}(x,y) \, dx \, dy. $$
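This iterated integral can be checked numerically (a sketch of mine, assuming $X,Y$ independent standard normals, so the true answer is $1/2$ by symmetry): for each $y$ we integrate the joint density over $x \in (-y, \infty)$, then integrate the result over $y$, truncating the tails at $\pm 6$ where the density is negligible.

```python
import math

# Joint pdf of two independent standard normals (assumed example);
# the true value of P(X + Y > 0) is 1/2 by symmetry.
def f(x, y):
    return math.exp(-(x * x + y * y) / 2) / (2 * math.pi)

# Fubini-style iterated midpoint integration: for each y, integrate x
# over (-y, +inf), truncating at +/-6 where the density is tiny.
h = 0.02
ys = [-6 + (j + 0.5) * h for j in range(int(12 / h))]
prob = 0.0
for y in ys:
    lo = max(-y, -6.0)                 # inner region is x > -y
    n = int((6.0 - lo) / h)
    inner = sum(f(lo + (i + 0.5) * h, y) for i in range(n)) * h
    prob += inner * h

print(prob)  # close to 0.5
```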

As for conditional probability, you usually require the conditioning event to have positive probability. If you've seen conditional expectation you can also use that, in the same way that $\mathbb P(A) = \mathbb E[ \chi_A]$ for $\chi_A$ the indicator function of $A$: $$\mathbb P(A\mid \mathcal F) := \mathbb E[\chi_A \mid \mathcal F]$$ for some $\sigma$-algebra $\mathcal F$, which you can take to be the $\sigma$-algebra generated by a single event $B$ to recover the elementary $\mathbb P(A \mid B)$.
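To see the measure-zero issue in action (my own sketch, not part of the formal theory): take $Y$ standard normal and $X = Y + \varepsilon$ with independent standard normal noise $\varepsilon$. The event $\{Y = 0\}$ has probability zero, but conditioning on the shrinking band $\{|Y| < \epsilon\}$ (which has positive probability) gives a well-defined answer, and here $P(X < 0 \mid Y = 0) = 1/2$ by symmetry:

```python
import random

random.seed(1)
N = 500_000
eps = 0.05  # half-width of the band around {Y = 0}

in_band = 0
also_neg = 0
for _ in range(N):
    y = random.gauss(0.0, 1.0)
    x = y + random.gauss(0.0, 1.0)  # X depends on Y
    if abs(y) < eps:        # condition on {|Y| < eps}, which has
        in_band += 1        # positive probability, unlike {Y = 0}
        if x < 0:
            also_neg += 1

print(also_neg / in_band)  # approaches P(X < 0 | Y = 0) = 1/2
```

Shrinking `eps` toward zero recovers the rigorous notion (conditional density / regular conditional probability) without ever dividing by a measure-zero event.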