Comparing two random variables with Monte Carlo sampling

Suppose there are two numbers X1 and X2 drawn independently from the same continuous probability distribution, whose form and range are unknown. You are given the value of X1 and need to determine whether X1 is less than or greater than X2.

Simply guessing would yield a 50% success rate.

How would you develop a method to improve the success rate?

One Possible Answer:

Apparently, Monte Carlo sampling lets you do better than 50%: compare X1 to a number drawn from a probability distribution that covers the range of the unknown distribution (any normal distribution works, since its range is -inf, +inf), and use the result of that comparison to decide whether X1 is > or < X2.

Example:

Let's say we are given X1 and we draw Y ~ N(u, sigma), where u can be any real number and sigma > 0. If X1 > Y, we guess that X1 > X2; otherwise we guess that X1 < X2. Repeated over a large number n of trials, the fraction of correct guesses comes out above 50%.
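
The procedure above can be sketched as a small simulation. This is only an illustrative sketch, not the code used for the results mentioned below: the function name `estimate_success_rate`, the choice of Uniform(0, 1) as the "unknown" distribution, and the threshold Y ~ N(0.5, 1) are all assumptions made for the example.

```python
import random


def estimate_success_rate(draw_x, draw_y, trials=200_000, seed=0):
    """Estimate how often the threshold rule guesses the order of X1 and X2.

    draw_x and draw_y are hypothetical helpers: each takes a random.Random
    instance and returns one sample of X (the unknown distribution) or of
    Y (the threshold distribution).
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        x1 = draw_x(rng)  # the value we are shown
        x2 = draw_x(rng)  # the hidden value
        y = draw_y(rng)   # independent random threshold
        guess_x1_larger = x1 > y  # rule: X1 > Y means we guess X1 > X2
        if guess_x1_larger == (x1 > x2):
            correct += 1
    return correct / trials


# Assumed example: X ~ Uniform(0, 1), Y ~ N(0.5, 1).
rate = estimate_success_rate(
    draw_x=lambda rng: rng.uniform(0.0, 1.0),
    draw_y=lambda rng: rng.gauss(0.5, 1.0),
)
print(f"estimated success rate: {rate:.3f}")
```

Swapping in a threshold that never lands inside the range of X (for example `draw_y=lambda rng: -10.0`) should bring the estimate back to roughly 50%.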

I ran this simulation and the results are indeed better than 50%. However, I don't quite understand it intuitively. If anyone is familiar with this technique, could you please break it down for me? A proof that the method works for this problem would also be very helpful.

There is 1 solution below


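The method works. Proof: Let $F_X$ be the common distribution function of $X_1$ and $X_2$, and let $Y$ lie in the interior of the range of the $X$'s, so that $0\lt F_X(Y)\lt 1$. If $Y\lt X_1$, we guess $X_2\lt X_1$, and there are two possibilities. If $X_2\le Y$, the guess $X_2\lt X_1$ is correct. If $X_2\gt Y$, then both $X_1$ and $X_2$ exceed $Y$, and by symmetry $X_2\lt X_1$ half the time. The net result is that the probability that $X_2\lt X_1$ is $F_X(Y)+0.5(1-F_X(Y))=0.5(1+F_X(Y))$, which is greater than $0.5$. An analogous calculation for $Y\ge X_1$, where we guess $X_2\gt X_1$, gives the success probability $(1-F_X(Y))+0.5F_X(Y)=1-0.5F_X(Y)$, which is again greater than $0.5$.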

Note that the distribution of $Y$ doesn't matter as long as $Y$ is within the range of the $X$'s. When $F_X(Y)=0$ or $F_X(Y)=1$, the method gives no advantage: the success rate falls back to 50%.
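
The two cases in the proof can be combined into a single success probability. As a short worked derivation (assuming $X_1$ and $X_2$ are i.i.d. and independent of $Y$, and writing $p=F_X(y)$ for a fixed value $y$ of the threshold, notation introduced here for convenience):

$$
\begin{aligned}
P(\text{correct}\mid Y=y)
&= P(X_1\gt y)\cdot\tfrac12(1+p) + P(X_1\le y)\cdot\left(1-\tfrac12 p\right)\\
&= (1-p)\cdot\tfrac12(1+p) + p\left(1-\tfrac12 p\right)\\
&= \tfrac12 + p(1-p).
\end{aligned}
$$

The gain over $0.5$ is $p(1-p)$, which is positive exactly when $0\lt p\lt 1$, vanishes at $p=0$ or $p=1$, and is largest at $p=\tfrac12$ (the median of $X$), where the success probability is $0.75$. For a random $Y$, averaging over its distribution gives $\tfrac12+E[F_X(Y)(1-F_X(Y))]$.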