Joint Distribution of dependent or independent random variables

I am aware that there are many questions with the same title, but I'm still confused. If two random variables $X$ and $Y$ are independent, then $P(X=x, Y=y)$ is just $P(X=x)\cdot P(Y=y)$. But what about when $Y$ is a function of $X$, or vice versa? How exactly should we understand independence in that case?

For instance, let us assume $X$ is a random variable taking values in $\{0, 1, 2, \ldots, n, \ldots\}$. When $Y = 4X + 3$, how would I calculate $P(X ≤ 30|Y ≥ 125)$? Clearly, when $X ≤ 30$ we always have $Y ≤ 123$, so $Y ≥ 125$ cannot hold. So do we consider this dependent or independent?

In $P(X ≤ 30|Y ≥ 125) = P(X ≤ 30,Y ≥ 125)/P(Y ≥ 125)$, is the numerator (which is the joint probability) not just $P(X ≤ 30)+P(Y ≥ 125)$? Or is it something else, like $P(Y ≥ 125|X ≤ 30)$? If so, how would I calculate it, and why?

To clarify, in short: how would I calculate the joint probability $P(X≤30,Y≥125)$?

The crux of my issue is understanding dependence and independence when calculating a joint distribution of random variables, and how I would calculate it when a conditional probability is involved.

There are 2 best solutions below

0
On BEST ANSWER

If $Y$ is a one-to-one function of $X$ (like $Y=4X+3$), then knowing $Y$ directly tells you the value of $X$, and vice versa. This means that probabilities like $$P(X=4\mid Y=19), \quad P(X=4\mid Y=18),\quad P(Y=15\mid X=3)$$ are all either $1$ or $0$: either the equation $Y=4X+3$ holds or it doesn't. The probabilities above are $1$, $0$ and $1$ respectively. (Strictly speaking, $Y=18$ cannot occur for integer $X$, so the middle one conditions on an impossible event and is $0$ only by convention.)

When you only know that one variable lies in a range of values (e.g. $Y\geq 125$), it gets slightly more complicated. Since $Y=4X+3$ means $X=\frac{Y-3}{4}$, the condition $Y\geq 125$ translates directly to $X\geq 30.5$. Hence $X\leq 30$ can never happen at the same time, and so $P(X\leq 30\mid Y\geq 125)=0$.
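Written out as a joint probability, the same substitution answers the question as it was asked. This is only a sketch, assuming $P(Y\geq 125)>0$ so that the conditional probability is defined:

$$P(X\leq 30,\,Y\geq 125)=P(X\leq 30,\,4X+3\geq 125)=P(X\leq 30,\,X\geq 30.5)=P(\varnothing)=0$$

$$P(X\leq 30\mid Y\geq 125)=\frac{P(X\leq 30,\,Y\geq 125)}{P(Y\geq 125)}=\frac{0}{P(Y\geq 125)}=0$$

Note that the joint probability is the probability of the intersection of the two events. It is neither the sum $P(X\leq 30)+P(Y\geq 125)$ nor, since $X$ and $Y$ are dependent, the product $P(X\leq 30)\,P(Y\geq 125)$.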

Let's make it a bit more interesting with $P(X\leq 30\mid Y\geq 103)$: $Y\geq 103$ means $X\geq 25$, so $P(X\leq 30\mid Y\geq 103)$ is the exact same as $$P(X\leq 30\mid X\geq 25)$$ Using $P(A\mid B)=P(A,B)/P(B)$, this can be rewritten as $$P(X\leq 30\mid X\geq 25)=\frac{P(X\leq 30,X\geq 25)}{P(X\geq 25)}=\frac{P(25\leq X\leq 30)}{P(X\geq 25)}$$ and if you know the distribution of $X$, you should be able to calculate the numerator and denominator.
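To make the last step concrete, here is a minimal Python sketch. It assumes, purely for illustration, that $X$ is uniform on $\{0, 1, \ldots, 34\}$; the question does not specify a distribution, so swap in your own. The names `pmf_x`, `y_of`, `prob` and `cond_prob` are hypothetical helpers, not part of any library.

```python
from fractions import Fraction

# ASSUMPTION (illustration only): X is uniform on {0, 1, ..., 34}.
# The original question does not give the distribution of X.
pmf_x = {x: Fraction(1, 35) for x in range(35)}


def y_of(x):
    """The deterministic relationship from the question: Y = 4X + 3."""
    return 4 * x + 3


def prob(event):
    """P(event), where event is a predicate on (x, y) with y = 4x + 3."""
    return sum(p for x, p in pmf_x.items() if event(x, y_of(x)))


def cond_prob(event_a, event_b):
    """P(A | B) = P(A, B) / P(B); only defined when P(B) > 0."""
    p_b = prob(event_b)
    if p_b == 0:
        raise ValueError("conditioning event has probability zero")
    # The joint probability is the probability that both events hold.
    p_ab = prob(lambda x, y: event_a(x, y) and event_b(x, y))
    return p_ab / p_b


# P(X <= 30 | Y >= 103) = P(25 <= X <= 30) / P(X >= 25)
p1 = cond_prob(lambda x, y: x <= 30, lambda x, y: y >= 103)
# P(X <= 30 | Y >= 125): the two events cannot occur together
p2 = cond_prob(lambda x, y: x <= 30, lambda x, y: y >= 125)

print(p1)  # 3/5 under the uniform assumption
print(p2)  # 0
```

Because $Y$ is computed from $X$, the joint probability only ever needs the distribution of $X$: every event about $(X, Y)$ is rewritten as an event about $X$ alone.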

2
On

Taking into account the definition of independence, i.e., $P(X=x, Y=y) = P(X=x)P(Y=y)$ for all $x$ and $y$, we see that the two are not independent.

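Consider $n = 35$, with $X$ uniformly distributed over 35 values (say $0, 1, \ldots, 34$). Then $P(X = 1) = 1/35$ and $P(Y = 83) = P(X = 20) = 1/35$. If $X$ and $Y$ were independent, we would have $P(X = 1\mid Y = 83) = P(X = 1) = 1/35$, but in fact $P(X = 1\mid Y = 83) = 0$.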

At a more intuitive level, think about whether the value of one variable gives you information about the outcome of the second variable. If the answer is yes, then the two are dependent. In this case, it is easy to see that knowing one variable gives us significant information about the second. Use the definition to make the intuition rigorous.
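One way to check the definition mechanically is to build the joint distribution and compare it with the product of the marginals at every pair of values. The sketch below assumes, for illustration only, that $X$ is uniform on $\{0, 1, \ldots, 34\}$ (35 equally likely values); `joint`, `marginal_x`, `marginal_y` and `is_independent` are hypothetical names.

```python
from fractions import Fraction

# ASSUMPTION (illustration only): X is uniform on {0, 1, ..., 34}.
p = Fraction(1, 35)

# Joint pmf of (X, Y): all probability mass sits on the line y = 4x + 3.
joint = {(x, 4 * x + 3): p for x in range(35)}


def marginal_x(x):
    """P(X = x), obtained by summing the joint pmf over all y."""
    return sum(q for (a, _), q in joint.items() if a == x)


def marginal_y(y):
    """P(Y = y), obtained by summing the joint pmf over all x."""
    return sum(q for (_, b), q in joint.items() if b == y)


def is_independent():
    """True iff P(X=x, Y=y) == P(X=x) * P(Y=y) for every pair (x, y)."""
    xs = {a for a, _ in joint}
    ys = {b for _, b in joint}
    return all(
        joint.get((x, y), 0) == marginal_x(x) * marginal_y(y)
        for x in xs
        for y in ys
    )


# A single violating pair is enough to show dependence.
print(joint.get((1, 83), 0))            # joint probability P(X=1, Y=83)
print(marginal_x(1) * marginal_y(83))   # product of the marginals
print(is_independent())
```

Here the joint probability $P(X=1, Y=83)$ is $0$ while the product of the marginals is $1/35^2$, so the definition fails and the variables are dependent.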