Finding conditional distribution in matching ordering situation

Suppose we draw two values $x_1,x_2$ according to a CDF $F$. Independently, we draw another two values $y_1,y_2$ according to another CDF $G$. Both $F$ and $G$ have support $[0,1]$.

Among those four values, I first observe $x_1$ only. And then, I get to observe one of $y_1$ and $y_2$, depending on whether $x_1\leq x_2$ or $x_1>x_2$ following the rule below:

--- If $x_1\leq x_2$, then I observe the smaller of $y_1$ and $y_2$ (either one if they are equal).

--- If $x_1>x_2$, I observe the larger of $y_1$ and $y_2$.

So, if my $x_1$ is smaller than the other $x$, my observation of $y$ equals the smaller value of $y_1$ and $y_2$. If my $x_1$ is larger than the other $x$, the $y$ I observe is the larger one between $y_1$ and $y_2$.

In this case, if I observe $x_1$ and some value $y$, what is the conditional distribution of the other value of $y$? (so, if $y=y_1$, then what is the distribution of $Y_2$?)
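For concreteness, the observation rule described above can be written as a small function. This is only a sketch of the setup; the name `observe` and the tuple it returns are illustrative choices, not part of the question.

```python
def observe(x1, x2, y1, y2):
    """Apply the observation rule from the question.

    Returns a pair (observed_y, hidden_y):
    - if x1 <= x2, the smaller of y1, y2 is observed and the larger is hidden;
    - if x1 >  x2, the larger of y1, y2 is observed and the smaller is hidden.
    """
    low, high = min(y1, y2), max(y1, y2)
    if x1 <= x2:
        return low, high
    return high, low


if __name__ == "__main__":
    # x1 is the smaller x, so the smaller y is the one that is seen.
    print(observe(0.2, 0.7, 0.9, 0.4))
    # x1 is the larger x, so the larger y is the one that is seen.
    print(observe(0.8, 0.3, 0.9, 0.4))
```

The question then asks for the distribution of `hidden_y` given `x1` and `observed_y`.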

BEST ANSWER

Let $Y$ be whichever of $Y_1, Y_2$ you observe, and let $Z$ be the one that you don't observe.

You want $$\mathbb{P}(Z \leq z \mid X_1 = x, Y = y).$$
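Before deriving anything, this target probability can be approximated by brute-force simulation, which gives a number to compare a closed-form candidate against. The sketch below assumes, purely for illustration, that $F$ and $G$ are both Uniform$(0,1)$, and it replaces the exact conditioning $X_1 = x, Y = y$ by small windows of half-width `eps`; the function name and all parameters are hypothetical.

```python
import random


def estimate_hidden_cdf(x, y, z, n=200_000, eps=0.05, seed=0):
    """Monte Carlo estimate of P(Z <= z | X1 near x, Y near y).

    Assumes F and G are both Uniform(0, 1) (an illustrative choice).
    Conditioning on exact values is approximated by keeping only the
    samples with |X1 - x| < eps and |Y - y| < eps.
    """
    rng = random.Random(seed)
    kept = 0
    hits = 0
    for _ in range(n):
        x1, x2 = rng.random(), rng.random()
        y1, y2 = rng.random(), rng.random()
        low, high = min(y1, y2), max(y1, y2)
        # Observation rule: smaller y if x1 <= x2, larger y otherwise.
        observed, hidden = (low, high) if x1 <= x2 else (high, low)
        if abs(x1 - x) < eps and abs(observed - y) < eps:
            kept += 1
            hits += hidden <= z
    return hits / kept if kept else float("nan")


if __name__ == "__main__":
    print(estimate_hidden_cdf(0.5, 0.5, 0.75))
```

Shrinking `eps` reduces the bias from windowing but leaves fewer samples, so `n` has to grow accordingly.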

The event $Z \leq z$ either happens with $X_1 \leq X_2$ or with $X_1 > X_2$, so the probability we want is equal to

$$\mathbb{P}(Z \leq z \ \wedge X_1 \leq X_2 \mid X_1 = x, Y = y) + \mathbb{P}(Z \leq z \ \wedge X_1 > X_2 \mid X_1 = x, Y = y).$$

In each case we know which of $Y, Z$ is the maximum and which is the minimum of $Y_1, Y_2$. Writing $w = \mathbb{P}(X_1 \leq X_2 \mid X_1 = x, Y = y)$ for the conditional probability of the first case, the probability equals

$$w \, \mathbb{P}(\max(Y_1, Y_2) \leq z \mid \min(Y_1, Y_2) = y) + (1 - w) \, \mathbb{P}(\min(Y_1, Y_2) \leq z \mid \max(Y_1, Y_2) = y).$$

The conditioning on $X_1, X_2$ can be dropped from the $Y_1, Y_2$ factors because the $X_1, X_2$ variables are independent of the $Y_1, Y_2$ random variables. The weight $w$, however, depends on $y$ as well as on $x$: the observed $Y$ is the minimum in one case and the maximum in the other, so its value is informative about which case occurred.

Calculating the $X_1, X_2$ terms before $Y$ is taken into account is straightforward assuming that there is no specific point $c$ with $\mathbb{P}(X_2 = c) > 0$:

$$\mathbb{P}(X_1 \leq X_2 \mid X_1 = x) = \mathbb{P}(x \leq X_2) = 1 - F(x), \textrm{ and } \ \mathbb{P}(X_1 > X_2 \mid X_1 = x) = F(x).$$

Assume in addition that $G$ has a density $g$. Then $\min(Y_1, Y_2)$ has density $2g(y)(1 - G(y))$ and $\max(Y_1, Y_2)$ has density $2g(y)G(y)$, so Bayes' rule gives

$$w = \frac{(1 - F(x))(1 - G(y))}{(1 - F(x))(1 - G(y)) + F(x)G(y)}.$$

By symmetry (see note at the end for more detail), \begin{align} \mathbb{P}(\max(Y_1, Y_2) \leq z \mid \min(Y_1, Y_2) = y) &= \mathbb{P}(Y_2 \leq z \mid Y_1 = y, Y_1 = \min(Y_1, Y_2)) \\ &= \mathbb{P}(Y_2 \leq z \mid Y_2 \geq y) \\ &= \max\left(0, \frac{\mathbb{P}(y \leq Y_2 \leq z)}{\mathbb{P}(Y_2 \geq y)}\right) \\ &= \max\left(0, \frac{G(z) - G(y)}{1 - G(y)}\right). \end{align}

The other term can be calculated as: \begin{align} \mathbb{P}(\min(Y_1, Y_2) \leq z \mid \max(Y_1, Y_2) = y) &= \mathbb{P}(Y_2 \leq z \mid Y_1 = y, Y_1 = \max(Y_1, Y_2)) \\ &= \mathbb{P}(Y_2 \leq z \mid Y_2 \leq y) \\ &= \min\left(1, \frac{\mathbb{P}(Y_2 \leq z)}{\mathbb{P}(Y_2 \leq y)}\right) \\ &= \min\left(1, \frac{G(z)}{G(y)}\right). \end{align}

Putting all this together we have

$$\mathbb{P}(Z \leq z \mid X_1 = x, Y = y) = w\max\left(0, \frac{G(z) - G(y)}{1 - G(y)}\right) + (1 - w)\min\left(1, \frac{G(z)}{G(y)}\right) = \frac{(1 - F(x))\max\left(0, G(z) - G(y)\right) + F(x)\min\left(G(z), G(y)\right)}{(1 - F(x))(1 - G(y)) + F(x)G(y)}.$$


Note on the symmetry step: the $Y_1, Y_2$ terms can be handled by splitting on which variable attains the minimum (ties have probability zero when $G$ is continuous): \begin{align} \mathbb{P}(\max(Y_1, Y_2) \leq z \mid \min(Y_1, Y_2) = y) &= \mathbb{P}(Y_1 \leq z \ \wedge \ Y_2 \leq z \mid \min(Y_1, Y_2) = y) \\ &= \mathbb{P}(Y_1 = \min(Y_1, Y_2) \mid \min(Y_1, Y_2) = y)\,\mathbb{P}(Y_2 \leq z \mid Y_1 = y, Y_1 = \min(Y_1, Y_2)) \\ &\quad + \mathbb{P}(Y_2 = \min(Y_1, Y_2) \mid \min(Y_1, Y_2) = y)\,\mathbb{P}(Y_1 \leq z \mid Y_2 = y, Y_2 = \min(Y_1, Y_2)). \end{align}

Since $Y_1$ and $Y_2$ are exchangeable, each of them is the minimum with probability $1/2$ given the value of the minimum, and, by the same symmetry, $\mathbb{P}(Y_2 \leq z \mid Y_1 = y, Y_1 = \min(Y_1, Y_2))$ equals $\mathbb{P}(Y_1 \leq z \mid Y_2 = y, Y_2 = \min(Y_1, Y_2))$, so we have

$$\mathbb{P}(\max(Y_1, Y_2) \leq z \mid \min(Y_1, Y_2) = y) = \mathbb{P}(Y_2 \leq z \mid Y_1 = y, Y_1 = \min(Y_1, Y_2)).$$
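The two case-wise conditional CDFs used in this answer translate directly into code. The following is a minimal sketch, assuming $G$ is passed in as a Python callable and is continuous with $0 < G(y) < 1$ at the observed point; the function names are illustrative and do not come from the answer.

```python
def cdf_max_given_min(z, y, G):
    """P(max(Y1, Y2) <= z | min(Y1, Y2) = y).

    Equals max(0, (G(z) - G(y)) / (1 - G(y))); requires G(y) < 1.
    """
    return max(0.0, (G(z) - G(y)) / (1.0 - G(y)))


def cdf_min_given_max(z, y, G):
    """P(min(Y1, Y2) <= z | max(Y1, Y2) = y).

    Equals min(1, G(z) / G(y)); requires G(y) > 0.
    """
    return min(1.0, G(z) / G(y))


def uniform_cdf(t):
    """CDF of Uniform(0, 1), used here only as an example choice of G."""
    return min(1.0, max(0.0, t))


if __name__ == "__main__":
    # Hidden value is the maximum, observed minimum is 0.5.
    print(cdf_max_given_min(0.75, 0.5, uniform_cdf))
    # Hidden value is the minimum, observed maximum is 0.5.
    print(cdf_min_given_max(0.25, 0.5, uniform_cdf))
```

Each function covers one of the two orderings of $X_1$ and $X_2$; the conditional distribution of the hidden value is a mixture of the two.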

SECOND ANSWER

[EDIT]

Let $Y$ be the unobserved $y_1$ or $y_2$. Let $y$ be the observed value among the two. If you know only $x_1$ and $y$:

$$ \mathbb{P}[Y\leq t|x_1,y] $$ $$ =\mathbb{P}[Y\leq t|x_1\leq x_2,x_1,y]\mathbb{P}[x_1\leq x_2|x_1,y] +\mathbb{P}[Y\leq t| x_1>x_2,x_1,y]\mathbb{P}[x_1>x_2|x_1,y] $$ Once the ordering of $x_1$ and $x_2$ is given, $x_1$ carries no further information about $Y$, so the first factor of each term simplifies: \begin{equation} =\mathbb{P}[Y\leq t|x_1\leq x_2,y]\mathbb{P}[x_1\leq x_2|x_1,y] +\mathbb{P}[Y\leq t| x_1>x_2,y]\mathbb{P}[x_1>x_2|x_1,y] \end{equation} The weights $\mathbb{P}[x_1\leq x_2|x_1,y]$ and $\mathbb{P}[x_1>x_2|x_1,y]$ keep their dependence on $y$: the observed value is a minimum under one ordering and a maximum under the other, so $y$ is informative about the ordering.

If $x_1\leq x_2$, the missing $Y$ is $\max\{y_1,y_2\}$. Otherwise, the missing $Y$ is $\min\{y_1,y_2\}$.

Part 1, calculation of $\mathbb{P}[Y\leq t|x_1\leq x_2,y]$. We are interested in the CDF of $Y=\max\{y_1,y_2\}$, conditional on $\min\{y_1,y_2\}=y$. Given that the minimum equals $y$, the other draw is distributed as a single draw $W$ from $G$ conditioned on $W\geq y$. In symbols, for any $t\in\mathbb{R}$: $$ \mathbb{P}[Y\leq t|x_1\leq x_2,y]=\mathbb{P}[W\leq t|W\geq y] $$ This can be found as follows: $$ =\frac{\mathbb{P}[W\leq t\cap W\geq y]}{\mathbb{P}[W\geq y]} =\frac{\mathbb{P}[y\leq W\leq t]}{\mathbb{P}[W\geq y]} $$

$$ =\begin{cases} \frac{\mathbb{P}[W\leq t]-\mathbb{P}[W<y]}{1-\mathbb{P}[W< y]}, & \text{if }y\leq t,\\ 0,&\text{else}. \end{cases} $$ To use your notation for the CDF, we assume $\mathbb{P}[W\leq t]=\mathbb{P}[W<t]$ (in other words, $G$ is continuous). Then: $$ =\begin{cases} \frac{G(t)-G(y)}{1-G(y)}, & \text{if }y\leq t,\\ 0,&\text{else}. \end{cases} $$

If $F$ is also continuous, then $\mathbb{P}[x_1\leq x_2|x_1]=1-F(x_1)$. If moreover $G$ has a density $g$, the minimum and maximum of $y_1,y_2$ have densities $2g(y)(1-G(y))$ and $2g(y)G(y)$, and Bayes' rule gives $$ \mathbb{P}[x_1\leq x_2|x_1,y]=\frac{(1-F(x_1))(1-G(y))}{D},\qquad D:=(1-F(x_1))(1-G(y))+F(x_1)G(y). $$ The first term in the initial equation is therefore: $$ \mathbb{P}[Y\leq t|x_1\leq x_2,y]\mathbb{P}[x_1\leq x_2|x_1,y] =\begin{cases} \frac{(G(t)-G(y))(1-F(x_1))}{D}, & \text{if }y\leq t,\\ 0,&\text{else}. \end{cases} $$

The second term is calculated analogously.