This question came up a little while ago but was unfortunately put on hold. I found it intriguing, as I had never come across a question like this before. Two groups of $30$ people are randomly selected from two normally distributed populations that have identical standard deviations but whose means for a particular characteristic are $4$ standard deviations apart. What is the probability that the groups overlap, that is, that the person with the highest statistic in the lower group exceeds the person with the lowest statistic in the higher group?
My Attempt: The area of overlap of the two distributions is $.0456$, which is twice the area of a single tail more than $2$ SDs from the mean. The probability of getting at least one person from a group in the overlap zone is then ${}^{30}C_1\cdot .9544^{29}\cdot .0456 + {}^{30}C_2\cdot .9544^{28}\cdot .0456^2 + \cdots + {}^{30}C_7\cdot .9544^{23}\cdot .0456^7$. Terms beyond the $7$th are small enough to ignore.
The probability has to account for the different numbers of people from each group in the overlap zone, as well as the probability that those people actually overlap. Limiting this to $1$ to $7$ people from each group, the combinations are $1,1;\ 1,2;\ 2,1;\ 2,2;\ 3,1;\ 1,3;\ 3,2;\ 2,3;\ 3,3;\ \dots;\ 7,7$. With $1,1$ in the overlap zone, the two orders AB and BA are equally likely and only BA is an overlap, so the probability is $\frac{1}{2}$. With $1,2$ the equally likely orders are A,B,B; B,A,B and B,B,A, so the probability of overlap is $\frac{2}{3}$. In general, only one of the $\frac{(n_1+n_2)!}{n_1!\cdot n_2!}$ distinguishable orders has every A to the left of every B, so this part of the calculation is $$1-\frac{n_1!\cdot n_2!}{(n_1+n_2)!}$$ So, the probability of overlap when the zone contains $4$ from group A and $5$ from group B would be:
$$1-\frac{4!\cdot 5!}{(4+5)!}= \frac{125}{126}$$
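The ordering probabilities quoted above ($\frac{1}{2}$, $\frac{2}{3}$, $\frac{125}{126}$) are all of the form "one minus the reciprocal of a binomial coefficient". A minimal Python sketch to check this by brute force follows; the function names are my own, and it assumes, as the attempt does, that every left-to-right ordering of the A's and B's in the zone is equally likely.

```python
from fractions import Fraction
from itertools import permutations
from math import comb


def overlap_given_counts(n1, n2):
    """Closed form: only 1 of C(n1+n2, n1) orderings has all A's left of all B's."""
    return 1 - Fraction(1, comb(n1 + n2, n1))


def overlap_by_enumeration(n1, n2):
    """Brute force: list the distinct orderings and count those that overlap."""
    labels = "A" * n1 + "B" * n2
    arrangements = set(permutations(labels))   # distinct A/B sequences
    non_overlapping = tuple(labels)            # A...AB...B is the only one
    overlapping = sum(1 for arr in arrangements if arr != non_overlapping)
    return Fraction(overlapping, len(arrangements))


for n1, n2 in [(1, 1), (1, 2), (2, 2), (4, 5)]:
    print(n1, n2, overlap_given_counts(n1, n2), overlap_by_enumeration(n1, n2))
```

The enumeration is only practical for small counts, but it agrees with the closed form for the cases listed.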
Putting this all together I get:
$$\begin{aligned}
P(A,B \ \text{overlap} \ge 1) ={}& \frac{1}{2}(30\cdot .9544^{29}\cdot .0456)^2 \\
&+ 2\cdot \frac{2}{3}(30\cdot .9544^{29}\cdot .0456)({}^{30}C_2\cdot .9544^{28}\cdot .0456^2) \\
&+ \frac{5}{6}({}^{30}C_2\cdot .9544^{28}\cdot .0456^2)^2 \\
&+ 2\cdot \frac{3}{4}(30\cdot .9544^{29}\cdot .0456)({}^{30}C_3\cdot .9544^{27}\cdot .0456^3) \\
&+ 2\cdot \frac{9}{10}({}^{30}C_2\cdot .9544^{28}\cdot .0456^2)({}^{30}C_3\cdot .9544^{27}\cdot .0456^3) \\
&+ \frac{19}{20}({}^{30}C_3\cdot .9544^{27}\cdot .0456^3)^2 \\
&+ \cdots + \frac{3431}{3432}({}^{30}C_7\cdot .9544^{23}\cdot .0456^7)^2 \\
={}& .4044
\end{aligned}$$
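The double sum above is tedious by hand, so here is a short Python sketch that evaluates it as written. It reproduces the attempt's own assumptions (a per-person probability of $.0456$ of landing in the overlap zone, and at most $7$ people per group in the zone); the constant and function names are mine, and the sketch says nothing about whether those assumptions are valid.

```python
from math import comb

P_ZONE = 0.0456      # per-person probability of being in the overlap zone (as assumed above)
GROUP_SIZE = 30
MAX_IN_ZONE = 7      # terms beyond 7 people per group are dropped, as in the text


def prob_k_in_zone(k):
    """Binomial probability that exactly k of the 30 people fall in the zone."""
    return comb(GROUP_SIZE, k) * (1 - P_ZONE) ** (GROUP_SIZE - k) * P_ZONE ** k


def order_overlap(i, j):
    """Probability that i A's and j B's in the zone are not perfectly separated."""
    return 1 - 1 / comb(i + j, i)


def attempted_probability():
    """Sum over all (i, j) with 1 <= i, j <= 7, covering both (i, j) and (j, i)."""
    counts = range(1, MAX_IN_ZONE + 1)
    return sum(
        order_overlap(i, j) * prob_k_in_zone(i) * prob_k_in_zone(j)
        for i in counts
        for j in counts
    )


print(round(attempted_probability(), 4))
```

Looping over ordered pairs $(i, j)$ accounts for the factor of $2$ on the mixed terms automatically; the printed value should land close to the $.4044$ quoted above.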
Does anyone want to comment on the method or correctness or have a simpler method for doing this calculation?
I did look at a $(1 - p)$ type solution: add up the probabilities of nobody being in the overlap zone, of only A's or only B's being in the zone, and of no overlap when both A's and B's are in the zone. This turned out to be just as long as the method I used.
I think I can see the flaw in my reasoning. An overlap of members between group A and B isn't limited to the region of overlap of distribution curves. That is, an A member can be outside the overlap zone but still be further to the right than a B member in the overlap zone.



I tried a Monte Carlo simulation, with the result that the probability of an overlap is about $0.523$.
First, here is my version of the problem statement. We have $X_1, X_2, X_3, \dots , X_{30}$ and $Y_1, Y_2, Y_3, \dots , Y_{30}$, where each $X_i$ is drawn independently from a Normal distribution with mean $-2$ and standard deviation $1$, and each $Y_i$ is drawn independently from a Normal distribution with mean $2$ and standard deviation $1$. The difference of the means is therefore $4$ standard deviations. We would like to know the probability that the maximum of $X_1, X_2, X_3, \dots , X_{30}$ is greater than the minimum of $Y_1, Y_2, Y_3, \dots , Y_{30}$.
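Stated this way, the probability can also be written as a one-dimensional integral, which gives a check that does not rely on random numbers. Conditioning on $M = \max X_i$, whose density is $30\,\varphi(m+2)\,\Phi(m+2)^{29}$, and using independence of the $Y_i$, $$P(\max X > \min Y) = 1 - \int_{-\infty}^{\infty} 30\,\varphi(m+2)\,\Phi(m+2)^{29}\,\bigl[1-\Phi(m-2)\bigr]^{30}\,dm.$$ The Python sketch below evaluates this with Simpson's rule; the function names, integration limits and step count are my own choices rather than anything fixed by the problem.

```python
import math


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def normal_sf(x):
    """Upper tail 1 - Phi(x), computed directly to avoid cancellation."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def overlap_probability(n=30, mu_x=-2.0, mu_y=2.0, lo=-12.0, hi=12.0, steps=4000):
    """1 - P(max X < min Y), via Simpson's rule on [lo, hi] (steps must be even)."""

    def integrand(m):
        density_of_max_x = n * normal_pdf(m - mu_x) * normal_cdf(m - mu_x) ** (n - 1)
        prob_all_y_above_m = normal_sf(m - mu_y) ** n
        return density_of_max_x * prob_all_y_above_m

    h = (hi - lo) / steps
    total = integrand(lo) + integrand(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return 1.0 - total * h / 3.0


print(round(overlap_probability(), 4))
```

If the setup is right, the result should agree with the simulated value of about $0.523$ to within the simulation's sampling error.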
One way to estimate the desired probability is to simulate many trials, using a pseudo-random number generator to generate the Normal variables. When I ran $10^6$ trials, the result was that the max $X$ was greater than the min $Y$ in $523,460$ cases, so the estimated probability is about $0.523$. A $95\%$ confidence interval for the probability is $0.5225$ to $0.5244$.
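For readers who want to repeat the experiment without R, here is a minimal Python sketch of the same kind of simulation. It is not the code that produced the figures above: the function name, the default trial count and the seed are my own choices, and it uses a different random number generator, so its estimate will differ slightly from $0.52346$.

```python
import math
import random


def simulate_overlap(trials=20_000, n=30, seed=1):
    """Estimate P(max X > min Y) with X ~ N(-2, 1) and Y ~ N(2, 1), n draws each."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        max_x = max(rng.gauss(-2.0, 1.0) for _ in range(n))
        min_y = min(rng.gauss(2.0, 1.0) for _ in range(n))
        if max_x > min_y:
            hits += 1
    p_hat = hits / trials
    # Normal-approximation (Wald) 95% confidence interval for a proportion.
    half_width = 1.96 * math.sqrt(p_hat * (1.0 - p_hat) / trials)
    return p_hat, (p_hat - half_width, p_hat + half_width)


estimate, interval = simulate_overlap()
print(estimate, interval)
```

The default of $20{,}000$ trials keeps the pure-Python loop quick; passing `trials=10**6` narrows the interval to roughly $\pm 0.001$ at the cost of a much longer run.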
I used R for the simulation. The `set.seed` statement at the start makes the results reproducible, so anyone running the same code should get exactly the same results.