Difference of Ordered Uniform Random Variables


Let $X_1, X_2, \ldots, X_n$ be $n$ i.i.d. Uniform(0,1) random variables and let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ be the order statistics of $X_1, \ldots, X_n$, so that:

$X_{(1)} < X_{(2)} < \cdots < X_{(n)},$

$X_{(1)} = \min(X_1, \ldots, X_n),$

$X_{(n)} = \max(X_1, \ldots, X_n).$

I know that each order statistic has a Beta distribution:

$X_{(i)} \sim \mathrm{Beta}(i,\ n+1-i)$

I want to find the distribution of the differences of consecutive order statistics:

$Y_{i+1,i} = X_{(i+1)} - X_{(i)}$

in order to calculate the total probability:

$p = P(Y_{2,1} < d_{2,1} \cap Y_{3,2} < d_{3,2} \cap \cdots \cap Y_{n,n-1} < d_{n,n-1}),$

where $d_{i+1,i}$ are given distances.

This proof, Difference of order statistics in a sample of uniform random variables, suggests that the distribution of $Y_{i+1,i}$ is

$Y_{i+1,i} \sim \mathrm{Beta}(1,\ n)$

If the events in the probability $p$ above were independent, $p$ would factor into a product of $\mathrm{Beta}(1, n)$ probabilities. Are these consecutive differences actually independent?




I believe your subscripts on the $Y$'s are backwards. Your individual distributions for the order statistics and their differences seem correct. However, your assertion about independence of the $Y$'s seems counter-intuitive to me, and does not turn out to be true in the simple simulation below (using R), for $n = 5$ and the 2nd, 3rd, and 4th order statistics. All four such differences in neighboring order statistics are constrained to add to the range of the five observations.

I will leave it to you to fix your notation, decide whether I correctly guessed your intentions, and investigate association between differences in order statistics.

 n = 5;  h = 2;  j = 3;  k = 4    # sample size and order-statistic indices
 m = 10^4                         # number of simulated samples
 xh = xj = xk = numeric(m)
 for (i in 1:m) {
   x = sort(runif(n))             # order statistics of n Uniform(0,1) draws
   xh[i] = x[h];  xj[i] = x[j];  xk[i] = x[k]
 }

 ks.test(xh, "pbeta", h, n+1-h)
 ##  One-sample Kolmogorov-Smirnov test  (data:  xh) 
 ## D = 0.0098, p-value = 0.2905 # Consistent with Beta(2,4)
 ## alternative hypothesis: two.sided 

 ks.test(xj, "pbeta", j, n+1-j)
 ## One-sample Kolmogorov-Smirnov test  (data:  xj) 
 ## D = 0.0066, p-value = 0.7684  # Consistent with Beta(3,3)
 ## alternative hypothesis: two.sided 

 ks.test(xk, "pbeta", k, n+1-k)
 ## One-sample Kolmogorov-Smirnov test  (data:  xk)
 ## D = 0.0057, p-value = 0.902  # Consistent with Beta(4, 2)
 ## alternative hypothesis: two.sided 

 yhj = xj - xh;  yjk = xk - xj
 ks.test(yhj, "pbeta", 1, n)
 ## One-sample Kolmogorov-Smirnov test  (data:  yhj)
 ## D = 0.0095, p-value = 0.3302  # Consistent with Beta(1, 5)
 ## alternative hypothesis: two.sided 

 ks.test(yjk, "pbeta", 1, n)  
 ## One-sample Kolmogorov-Smirnov test  (data:  yjk)
 ## D = 0.0076, p-value = 0.6164  # Consistent with Beta(1, 5)
 ## alternative hypothesis: two.sided 

 cor(yhj, yjk)
 ## -0.199315  # NOT consistent with 0 correlation implied by independence

Note: The simulation method is certainly not optimal for speed, but it may be more transparent than an optimized one for readers unfamiliar with R programming.
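The negative sample correlation above is exactly what theory predicts: the $n+1$ gaps that $n$ uniforms cut out of $[0,1]$ (the $n-1$ consecutive differences plus the two boundary gaps) are jointly Dirichlet$(1, \ldots, 1)$, and the correlation between any two distinct gaps is $-1/n$, i.e. $-0.2$ for $n = 5$. A quick check in Python with NumPy (a sketch, separate from the R simulation above):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 100_000

# Simulate n Uniform(0,1) draws per row and sort each row to get order statistics.
x = np.sort(rng.uniform(size=(m, n)), axis=1)
y32 = x[:, 2] - x[:, 1]   # X_(3) - X_(2)
y43 = x[:, 3] - x[:, 2]   # X_(4) - X_(3)

print(np.corrcoef(y32, y43)[0, 1])   # close to the theoretical -1/n = -0.2
```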


They can't be independent, since the consecutive differences sum to $X_{(n)} - X_{(1)} \leq 1$. If you know, for example, that $Y_{2,1} = 0.5$, then $Y_{3,2} + Y_{4,3} + \cdots + Y_{n,n-1} \leq 0.5$, which forces $Y_{3,2} \leq 0.5$. The conditional distribution of $Y_{3,2}$ is then supported on $[0, 0.5]$, so it cannot be the unconditional $\mathrm{Beta}(1, n)$ distribution.
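To see numerically how far off the independence assumption can be, here is a Monte Carlo sketch in Python (the sample size $n = 5$ and the common distance $d = 0.1$ are arbitrary illustrative choices). Under independence, $p$ would be the product of the $n-1$ marginal $\mathrm{Beta}(1, n)$ CDF values at $d$; the true joint probability is substantially smaller:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, d = 5, 200_000, 0.1          # n uniforms, m replications, common distance d

x = np.sort(rng.uniform(size=(m, n)), axis=1)
spacings = np.diff(x, axis=1)      # the n-1 consecutive differences Y_{i+1,i}

p_joint = np.mean(np.all(spacings < d, axis=1))
p_indep = (1 - (1 - d) ** n) ** (n - 1)   # product of Beta(1, n) CDFs at d

# Exact joint probability via the Dirichlet(1, ..., 1) representation of the
# gaps: the n-1 interior gaps have joint density 120 * (1 - sum y) for n = 5.
p_exact = 120 * (d ** 4 - 2 * d ** 5)

print(p_joint, p_exact, p_indep)   # simulated joint prob. is well below p_indep
```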