Distributions of differences after sorting and pairing

38 Views Asked by At

Question

Suppose we sample $2n$ values $(x_i)_{i=1}^{2n}$ from some distribution on $\mathbb{R}$ (say the $x_i$ are i.i.d. normal). We then sort the values $x_{\pi(1)}\leq\ldots\leq x_{\pi(2n)}$ and consider the differences $d_k=x_{\pi(n+k)}-x_{\pi(k)}$, $1\leq k\leq n$.

Numerical example ($n=4$): The sequence 3, 5, 3, 1, 7, 2, 4, 4 gets sorted to 1, 2, 3, 3, 4, 4, 5, 7 and the differences are $4-1$, $4-2$, $5-3$, $7-3$ or 3, 2, 2, 4.

input 3 5 3 1 7 2 4 4
sorted 1 2 3 3 4 4 5 7
top half bottom half difference
4 1 3
4 2 2
5 3 2
7 3 4

What are the distributions of the differences $d_k$? I'm not sure how to approach this without resorting to simulation. Is there anything like a closed-form solution or expected values or anything worth saying (say for i.i.d. normal $x_i$)?

Experiments

[Uniform $[0,1]$] Here are 10 experiments superimposed, each the average of 100,000 trials for $n=128$ and $x_i$ uniform on $[0,1]$, i.e. each curve is a simulation of $f(k)=\mathbb{E}(d_k)$. This agrees with the theoretical result in the next section.

enter image description here

[$\mathcal{N}(0,1)$] Here are 10 experiments superimposed, each the average of 100,000 trials for $n=128$ and $x_i$ standard normal, i.e. each curve is a simulation of $f(k)=\mathbb{E}(d_k)$. This is kind of what I expected, a U-shaped curve.

enter image description here

Order statistics for the uniform distribution

Wikipedia has some information about order statistics for the uniform distribution on $[0,1]$. From this we find that the expectation of the differences is constant, $$ \mathbb{E}(d_k)=\frac{n}{2n+1}, $$ which agrees with the experiment above ($128/257=0.498...$).