Relation between positive correlation and $p(Y_2 > Y_1 \mid X_2 > X_1) > \frac{1}{2}$?

Looking at another question about "intuition" for the sign of the correlation, I was going to say that positive correlation $\rho(X, Y) > 0$ roughly means that if $X$ increases, then $Y$ is more likely than not to increase as well. But then I realized the latter can be made precise using a conditional probability: suppose $X$ and $Y$ are random variables on a probability space $A$, and for $i \in \{ 1, 2 \}$ let $X_i = X \circ \pi_i$ and $Y_i = Y \circ \pi_i$ on the product probability space $A \times A$, where $\pi_i : A \times A \to A$ is the projection onto the $i$-th factor. Then we want to know whether $p(Y_2 > Y_1 \mid X_2 > X_1) > \frac{1}{2}$. I'm not sure whether there are situations where the correlation is positive but the conditional probability is strictly less than $\frac{1}{2}$.

So, the question is: is there any implication one way or the other between these two statements? Or, if not, what about the similar idea $E(Y_2 - Y_1 \mid X_2 > X_1) > 0$?
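For a distribution with finitely many values, the product-space construction just means drawing two independent copies of $(X, Y)$, so both conditional quantities can be computed exactly by enumerating ordered pairs. A minimal sketch, assuming a finite distribution given as `(x, y, weight)` triples; the function name `conditional_stats` and the toy distribution are made up for illustration:

```python
from fractions import Fraction

def conditional_stats(dist):
    """dist: list of (x, y, weight) triples with weights summing to 1.

    Returns (P(Y2 > Y1 | X2 > X1), E(Y2 - Y1 | X2 > X1)) for two
    independent draws. Assumes X takes at least two distinct values,
    otherwise the conditioning event has probability zero.
    """
    den = num = exp = Fraction(0)
    for x1, y1, w1 in dist:
        for x2, y2, w2 in dist:
            if x2 > x1:                # the conditioning event X2 > X1
                w = w1 * w2            # independence: weights multiply
                den += w
                exp += w * (y2 - y1)
                if y2 > y1:
                    num += w
    return num / den, exp / den

# Hypothetical toy distribution: three equally likely points.
toy = [(0, 0, Fraction(1, 3)), (1, 1, Fraction(1, 3)), (2, 0, Fraction(1, 3))]
prob, mean_diff = conditional_stats(toy)
print(prob, mean_diff)
```

Exact fractions are used so that comparisons with $\frac{1}{2}$ and $0$ are not affected by rounding.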

There are 2 best solutions below


Consider the case where $A = \{ 1, 2, 3, 4 \}$ with the uniform distribution, and the values of $(X, Y)$ at the four points are $(-3.2, -3)$, $(1, 11)$, $(1.1, -4)$, $(1.1, -4)$. Then $\bar X = \bar Y = 0$, so the covariance is $\frac{1}{4} \sum_{i=1}^4 X(i) Y(i) = 2.95 > 0$, which implies the correlation is positive. However, the ordered pairs $(i, j)$ with $X(i) < X(j)$ are $(1, 2)$, $(1, 3)$, $(1, 4)$, $(2, 3)$, $(2, 4)$, each equally likely. Of these, the only one with $Y(i) < Y(j)$ is $(1, 2)$, so $p(Y_2 > Y_1 \mid X_2 > X_1) = \frac{1}{5}$.
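The arithmetic in this example is easy to check by machine. A small sketch using exact fractions; the variable names are only illustrative:

```python
from fractions import Fraction
from itertools import permutations

# The four equally likely (X, Y) values from the example, as exact fractions.
pts = [(Fraction(-16, 5), Fraction(-3)), (Fraction(1), Fraction(11)),
       (Fraction(11, 10), Fraction(-4)), (Fraction(11, 10), Fraction(-4))]
n = len(pts)

mean_x = sum(x for x, _ in pts) / n
mean_y = sum(y for _, y in pts) / n
cov = sum((x - mean_x) * (y - mean_y) for x, y in pts) / n

# Ordered pairs of distinct sample points with X(i) < X(j).
# permutations() works by position, so the repeated point counts twice.
increasing = [(p, q) for p, q in permutations(pts, 2) if p[0] < q[0]]
concordant = [(p, q) for p, q in increasing if p[1] < q[1]]
prob = Fraction(len(concordant), len(increasing))

print(mean_x, mean_y, cov, prob)
```

Since all ordered pairs are equally likely under the uniform distribution, the conditional probability reduces to a ratio of counts.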

Similarly, $E(Y_2 - Y_1 \mid X_2 > X_1) = \frac{1}{5} (14 + (-1) + (-1) + (-15) + (-15)) = -3.6 < 0$.
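The conditional expectation can be checked the same way, by averaging $Y(j) - Y(i)$ over the five ordered pairs with $X(i) < X(j)$. Again only a sketch, with illustrative names:

```python
from fractions import Fraction
from itertools import permutations

# Same four equally likely (X, Y) values as in the example.
pts = [(Fraction(-16, 5), Fraction(-3)), (Fraction(1), Fraction(11)),
       (Fraction(11, 10), Fraction(-4)), (Fraction(11, 10), Fraction(-4))]

# Differences Y(j) - Y(i) over ordered pairs with X(i) < X(j).
diffs = [q[1] - p[1] for p, q in permutations(pts, 2) if p[0] < q[0]]
mean_diff = sum(diffs) / len(diffs)

print(sorted(diffs), mean_diff, float(mean_diff))
```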

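Therefore, positive correlation implies neither the conditional probability statement nor the conditional expectation statement. The converse implications fail as well: replacing $Y$ by $-Y$ in the example gives covariance $-2.95 < 0$, while $p(Y_2 > Y_1 \mid X_2 > X_1) = \frac{4}{5}$ and $E(Y_2 - Y_1 \mid X_2 > X_1) = 3.6 > 0$. So there is no implication in general in either direction.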

Summary: Here is a case in which $$\operatorname{corr}(X,Y) \approx 0.99999972 \text{ and } \Pr(Y_1<Y_2 \mid X_1<X_2) = \dfrac{57}{253} < \dfrac 1 2.$$


Suppose $(X_1,Y_1), (X_2,Y_2),(X_3,Y_3),\ldots$ are independent and all have the same bivariate distribution, and $\operatorname{corr}(X_1,Y_1)>0.$

Can we conclude that $\Pr(Y_2 > Y_1 \mid X_2 > X_1) > \dfrac 1 2 \text{?}$

Suppose $$ (X_1,Y_1) = \begin{cases} (10000,10000) & \text{with probability } 1/30, \\ (-10000,-10000) & \text{with probability } 1/30, \\ (1,-1) & \text{with probability } 14/30, \\ (-1,1) & \text{with probability } 14/30. \end{cases} $$ For this distribution I get $\operatorname{corr}(X_1,Y_1) \approx 0.99999972.$ Let us find $\Pr(Y_2>Y_1\mid X_2>X_1).$
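The correlation quoted for this four-point distribution can be confirmed exactly. The sketch below uses rational arithmetic and relies on the observation (checked in the code) that $X_1$ and $Y_1$ have the same variance here, so no square root is needed:

```python
from fractions import Fraction

# (x, y, probability) for the four-point distribution above.
dist = [(10000, 10000, Fraction(1, 30)), (-10000, -10000, Fraction(1, 30)),
        (1, -1, Fraction(14, 30)), (-1, 1, Fraction(14, 30))]

mean_x = sum(w * x for x, y, w in dist)
mean_y = sum(w * y for x, y, w in dist)
cov = sum(w * (x - mean_x) * (y - mean_y) for x, y, w in dist)
var_x = sum(w * (x - mean_x) ** 2 for x, y, w in dist)
var_y = sum(w * (y - mean_y) ** 2 for x, y, w in dist)

# Here var_x == var_y, so sqrt(var_x * var_y) = var_x and the
# correlation is an exact fraction.
assert var_x == var_y
corr = cov / var_x
print(corr, float(corr))
```

In closed form this is $\dfrac{2\cdot 10^8 - 28}{2\cdot 10^8 + 28}$, which rounds to $0.99999972$.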

First look at the event on which we are conditioning, $X_2>X_1.$ The pairs $(X_1,X_2)$ in this event and their probabilities are $$ \begin{array}{cc} (X_1,X_2) & \text{probability} \\ \hline (-10000,-1) & 14/30^2 \\ (-10000,1) & 14/30^2 \\ (-10000,10000) & 1/30^2 \\ (-1,1) & 14^2/30^2 \\ (-1,10000) & 14/30^2 \\ (1,10000) & 14/30^2 \end{array} $$

We have $14 + 14 + 1 + 14^2 + 14 + 14 = 253,$ so the conditional probabilities given this event are $$ \begin{array}{cc} (X_1,X_2) & \text{conditional probability} \\ \hline (-10000,-1) & 14/253 \\ (-10000,1) & 14/253 \\ (-10000,10000) & 1/253 \\ (-1,1) & 14^2/253 \\ (-1,10000) & 14/253 \\ (1,10000) & 14/253 \end{array} $$

In which of these cases where $X_1<X_2$ do we have $Y_1<Y_2\text{?}$ $$ \begin{array}{cc} (X_1,X_2) & Y_1<Y_2\text{ ?} \\ \hline (-10000,-1) & \text{true} \\ (-10000,1) & \text{true} \\ (-10000,10000) & \text{true} \\ (-1,1) & \text{false} \\ (-1,10000) & \text{true} \\ (1,10000) & \text{true} \end{array} $$

Only the case $(-1,1)$ fails, so $\Pr(Y_1<Y_2 \mid X_1<X_2) = \dfrac{253 - 14^2}{253} = \dfrac{57}{253} < \dfrac 1 2.$
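The case analysis above can be reproduced by brute force over ordered pairs of outcomes. A short sketch, with weights expressed in units of $1/30$ so that the totals come out in units of $1/30^2$:

```python
from fractions import Fraction

# Outcome (x, y) -> weight in units of 1/30.
weights = {(10000, 10000): 1, (-10000, -10000): 1, (1, -1): 14, (-1, 1): 14}

den = num = 0
for (x1, y1), w1 in weights.items():
    for (x2, y2), w2 in weights.items():
        if x1 < x2:               # conditioning event X1 < X2
            den += w1 * w2        # independent draws: weights multiply
            if y1 < y2:
                num += w1 * w2

prob = Fraction(num, den)
print(num, den, prob)
```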

So the ordering of the values, i.e. whether $X_1<X_2$ and whether $Y_1<Y_2,$ is not the only thing that matters for the correlation: the absolute size of the numbers also matters.
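One way to see this point numerically is to replace the value $10000$ by a parameter $a > 1$ (a hypothetical variation, not part of the example above). All orderings stay the same, so the conditional probability does not change, but the correlation does, and it even becomes negative for small $a$. A sketch, with illustrative function names:

```python
from fractions import Fraction
from math import sqrt

def example(a):
    """The distribution above with 10000 replaced by a (assumes a > 1)."""
    return [(a, a, Fraction(1, 30)), (-a, -a, Fraction(1, 30)),
            (1, -1, Fraction(14, 30)), (-1, 1, Fraction(14, 30))]

def corr(dist):
    """Pearson correlation of a finite distribution of (x, y, weight) triples."""
    ex = sum(w * x for x, y, w in dist)
    ey = sum(w * y for x, y, w in dist)
    cov = sum(w * (x - ex) * (y - ey) for x, y, w in dist)
    vx = sum(w * (x - ex) ** 2 for x, y, w in dist)
    vy = sum(w * (y - ey) ** 2 for x, y, w in dist)
    return float(cov) / sqrt(float(vx) * float(vy))

def prob_up(dist):
    """P(Y1 < Y2 | X1 < X2) for two independent draws from dist."""
    den = sum(w1 * w2 for x1, y1, w1 in dist for x2, y2, w2 in dist
              if x1 < x2)
    num = sum(w1 * w2 for x1, y1, w1 in dist for x2, y2, w2 in dist
              if x1 < x2 and y1 < y2)
    return num / den

for a in (2, 10, 10000):
    d = example(a)
    print(a, corr(d), prob_up(d))
```

For this family the correlation works out to $\dfrac{a^2 - 14}{a^2 + 14}$, which is negative for $a = 2$ and close to $1$ for $a = 10000$, while the conditional probability is $\dfrac{57}{253}$ throughout.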