Probability that a set of uniformly distributed random variables is 'greater' than another such set.

149 Views Asked by At

Suppose we generate several uniformly distributed random variables (between 0 and 1), and arrange them in descending order to form a set [A1, B1, C1...]. We then do the same process to form a second set [A2, B2, C2...].

Given the number of members in each set (n), what is the probability that the first set is 'fully greater' than the second set, in that A1>A2, B1>B2, C1>C2 etc?

It appears that the probability is 1/(n+1). However, I haven't been able to prove this yet.

The problem can be helpfully reformulated as follows: generate 2n uniformly distributed random variables. Arrange in descending order, and attach numerical label (greatest = 1, second = 2, etc).

There are now 2n! / (n!)^2 ways to pick the first set from this larger set. We now need to count how many of these ways will result in the first set being larger than the second.

When n=2; there are 2 ways (the first set can be [1,2] or [1,3]) out of 6, yielding a probability of 1/3.

When n=3; there are 5 ways ([1,2,3];[1,2,4];[1,2,5];[1,3,4];[1,3,5]) out of 20, yielding a probability of 1/4.

When n=4, there are 14 ways our of 70, a probability of 1/5.

Does this trend continue? Can it be proven?

1

There are 1 best solutions below

0
On BEST ANSWER

Your problem can be interpreted as a special case of Bertrand's ballot theorem, which answers the question: In an election where candidate 1 receives the same number of votes as candidate 2, what is the chance that candidate 1 is never behind throughout the vote count?

The proof creates a 1:1 correspondence between your merged set of $2n$ items and paths in the $(x,y)$ plane from $(0,0)$ to $(n,n)$, as follows: Start at $(0,0)$. Examine the items in the merged set in descending order. When you see an item from set 1, move right one step. when you see an item from set 2, move up one step. There are $2n\choose n$ total paths.

Now call a path bad if the 'greater' event fails to occur, so a path is bad if and only if it ever touches the line $y=x+1$. By reflecting bad paths in the line $y=x+1$, argue that there is a 1:1 correspondence between bad paths and paths from $(-1,1)$ to $(n,n)$. Hence the number of bad paths is ${n-1 + n+1\choose n-1}={2n \choose n-1}$, the probability of a bad path is ${2n \choose n-1}/{2n\choose n}={n\over n+1}$, and the probability of a good path (i.e., the 'greater' event) is $\frac1{n+1}$.

The above argument is explained in more detail in the wikipedia article.