Say I have a set $S = \{x_1, \dots, x_m\}$, where $1 \le x_i \le n$, all distinct, with median $M$. I take a sample $T$ of size $t$ from $S$, with replacement. I call $Y$ the median of $T$. What is the probability distribution of $Y$?
Specifically, I'm trying to argue that $t$ needs to be large (in particular, not in $o(n)$) for $P[M - \frac{n}{10} \le Y \le M + \frac{n}{10}] \ge \frac{2}{3}$ (i.e. for it to be a 10% approximation to the value of the median). This leads me to wonder about $F_Y(y) = P[Y \le y]$. Full disclosure: This last bit is my homework, so you don't need to tell me the answer to that :p
Intuitively, $Y$ will be with high probability something whose rank is near $\frac{m}{2}$, but its value can be arbitrarily far off from $M$, I'd think. Experimentally, with $m = n = t$, the probability above is smaller than $\frac{1}{2}$ for $n \in \{1, \dots, 8\}$. I tried to show this by running the above algorithm on a given set $S$, then modifying $S$ by a monotonic increasing function, arguing that the result should be the same if I run the algorithm again, and seeing that the actual value $Y$ can now be much more spread out than before (thus not meeting the bound), but handling $Y$ is complicated.
I found this paper which explicitly gives the CDF for $Y$ on page 23, but it seems awfully hairy. Could it be simplified for the case $n = m = t$, perhaps?
For simplicity let's suppose $n=m=t = 2k-1$ is odd, so if $x_1 < \ldots < x_m$, the median $M$ is $x_k$. Your sample median $Y \le x_j$ iff at least $k$ of your sample values are $\le x_j$. The number of values $\le x_j$ in your sample is a binomial random variable with parameters $n$ and $p = j/n$, i.e. $$ P\left[Y \le x_j\right] = \sum_{i=k}^n {n \choose i} p^i (1 - p)^{n-i} $$ Using the normal approximation to the binomial, $$ P\left[Y \le x_j \right] \approx 1 - \Phi \left(\dfrac{k - j}{\sqrt{j - j^2/n}}\right)$$ where $\Phi$ is the standard normal CDF.