I have a set of elements $\{x_1, x_2, \ldots, x_n\}$ drawn from a normal distribution.
Now, I want to choose $K$ random pairs $(x_i, x_j)$ and compute their difference $\Delta x = |x_i-x_j|$.
How can I compute the probability of getting at least $M$ pairs with $\Delta x > D$, for a given $D$?
I'll explain what I did, and why I think it's wrong:
Let $\Delta$ be the set of all possible pairs $(x_i, x_j)$; then $|\Delta| = {n\choose 2} = \frac{n(n-1)}{2}$.
Let $Z$ be the random variable given by $\Delta x$. I assumed that $Z$ also has a normal distribution, so it is possible to compute its mean and standard deviation. With those parameters, it's just a matter of computing the probability $P(Z>D)$.
Now, because I need to find the probability of getting at least $M$ pairs, I might use another random variable ($Y$) which counts the number of "successes" I find when randomly extracting $K$ elements from $\Delta$.
I could use the binomial distribution, $Y \sim Bin(K,p)$, with $K$ being the number of experiments and $p = P(Z>D)$ the probability of success in each one. I could then apply the binomial formula for $k$ successes with $k$ from zero to $M-1$ and take the complement, $P(Y \ge M) = 1 - \sum_{k=0}^{M-1} {K\choose k} p^k (1-p)^{K-k}$ (assuming $M$ is small, the complement is easier to compute).
The problem I find with this approach is that I don't think there's independence between the experiments. For instance, say I choose a pair $(x_i, x_j)$, it's clear that I won't be able to choose the pair $(x_j, x_z)$ at a later time.
I am thinking I could use the hypergeometric distribution instead of the binomial, but I'm still not convinced.
Any thoughts on this?
Thanks
The solution to this problem was simpler than I thought; I just had to pay more attention to the bibliography.
Let $Y_1$ and $Y_2$ be the random variables for the two elements of a pair. They are independent, and each follows $\mathcal{N}(\mu, \sigma^2)$.
Let $Z = Y_1 - Y_2$, then $Z \sim \mathcal{N}(0, 2\sigma^2)$. Therefore, for any number $d$:
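$\begin{align} P(|Z|>d) &= P(Z < -d) + P(Z > d)\\ &= P(Z < -d) + 1 - P(Z \le d)\\ &= \Phi\left(\frac{-d}{\sqrt{2}\,\sigma}\right) + 1 - \Phi\left(\frac{d}{\sqrt{2}\,\sigma}\right) && \text{since } Z \text{ has standard deviation } \sqrt{2}\,\sigma\\ &= 2\left(1 - \Phi\left(\frac{d}{\sqrt{2}\,\sigma}\right)\right) && \text{since } \Phi(-x) = 1 - \Phi(x) \end{align}$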
With this probability I can now apply the Binomial process described in the question.
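For anyone who wants to check the numbers, here is a minimal Python sketch of the two steps: computing $p = P(|Z| > D)$ for $Z \sim \mathcal{N}(0, 2\sigma^2)$, and then the binomial tail $P(Y \ge M)$. The function names and the sample values of `sigma`, `d`, `k` and `m` are my own placeholders, not part of the original problem, and the sketch assumes the $K$ pairs can be treated as independent trials.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, Phi(x), via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def prob_abs_diff_exceeds(d: float, sigma: float) -> float:
    """P(|Y1 - Y2| > d) for independent Y1, Y2 ~ N(mu, sigma^2)."""
    # Z = Y1 - Y2 ~ N(0, 2*sigma^2), so its standard deviation is sigma*sqrt(2).
    return 2.0 * (1.0 - normal_cdf(d / (sigma * math.sqrt(2.0))))


def prob_at_least(m: int, k: int, p: float) -> float:
    """P(Y >= m) for Y ~ Bin(k, p), computed through the complement."""
    # Sum the probabilities of 0, 1, ..., m-1 successes and subtract from 1.
    below = sum(math.comb(k, i) * p**i * (1.0 - p) ** (k - i) for i in range(m))
    return 1.0 - below


if __name__ == "__main__":
    # Placeholder values, chosen only for illustration.
    sigma, d, k, m = 1.0, 1.5, 20, 5
    p = prob_abs_diff_exceeds(d, sigma)
    print(f"p = P(|Z| > d)  = {p:.6f}")
    print(f"P(Y >= {m})       = {prob_at_least(m, k, p):.6f}")
```

The complement form only sums $M$ terms, which is convenient when $M$ is small compared with $K$.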