There is a list $X$ of $N$ different numbers, sorted in ascending order. First, we perform resampling with replacement, constructing a list $Y$. This means that we consider a uniform distribution over the elements of $X$, and sample from that distribution $N$ times, forming the list $Y$. Note that $Y$ will likely contain multiple copies of some elements of $X$, and some other elements will be absent.
- For example, if $X = [1,2,3,4,5,6]$, then $Y$ could be any combination of the above, for example $Y = [6,4,6,1,1,3]$
Question: What is the probability that the j-th smallest element of $X$ is also the i-th smallest element of $Y$.
- For example, if $Y$ is the one above and we are looking for the 2nd smallest element of $Y$ (i.e. $i = 2$), its value would be 1, which means that it is the 1st smallest element of $X$ (i.e. $j = 1$) in this particular random outcome for $Y$.
Note: If you find an analytic solution, that's cool. However, it is sufficient to find an algorithm that requires a finite amount of operations.
Edit: I have performed a simulation for N=20 doing 10000 resamples. It appears that the target probability is somewhat similar to a gaussian $e^{(i-j)^2}$, but not quite, as the standard deviation is wider in the center than at the edges. The function appears symmetric in $i$ and $j$, which does not seem trivial to me.
Here I also plot the log10 of the distribution to see the tails better. The white squares are NAN due to finite number of samples taken



In summary, we can formulate a recurrence relation for this problem and then solve it using dynamic programming. Suppose the length of $X$ is $n$, and the i-th smallest number in $X$ is denoted as $X_i$. We are asked to find the probability of $X_i$ being the j-th smallest element in $Y$.
Assume that we have already selected $(n-a)$ numbers into $Y$, and we still need to choose
amore numbers. In the numbers already selected, there are $y$ numbers strictly smaller than $X_i$ and $z$ numbers strictly greater than $X_i$.For the boundary conditions, if $y > j-1$ or $z > n - j$, then the probability must be zero. If $a = 0$, the probability is $1$ (because the selection process has been completed at this point). For the purpose of simplification in our recurrence, let's denote $b = j - 1 - y$ and $c = n - j - z$. You can consider b and c as some form of quotas. The probability of achieving the goal at this point is denoted as $d(a,b,c)$.
When selecting an element from X, there are three possibilities:
So, we have $d(a,b,c) = \frac{1}{n} \times d(a-1, b, c) + \frac{i-1}{n} \times d(a-1, b-1, c) + \frac{n-i}{n} \times d(a-1, b, c-1)$.
The answer we ultimately want to solve is equal to $d(n, j-1, n-j)$.
The python code implementation is as follows:
I have obtained results that are identical to your random experiment: prob.png
The closed-form solution to this problem should also exist, which I speculate is some form involving combinations. As for the symmetry of $i$ and $j$, it can also be proven using mathematical induction with the recurrence, and I leave this as an exercise here.
Here are additional proofs of the termination condition.
We can assure that when we reach the termination condition
a==0successfully, the following propositions must hold:b>=0 and c>=0. Otherwise, the program will enter the first branch (a < 0 or b < 0 or c < 0).