Problem definition:
D: a size n set of real numbers
M: a size m set of real numbers sampled from D without replacement,
each number in D has equal probability to appear in M.$m \leq n$
I want to estimate percentile $q \in [0,1]$ for n. That is: I want a number x such that there are $qn$ numbers in D less than x. Since n is too large, I will use quantile in M to approximate quantile in D.
I want to estimate the expected value of m, such that I have $1-\delta$ probability of having my estimation y with error $|\hat{q} - q| \leq \epsilon$. Where $\hat{q}$ is the ground truth quantile of y in D.
Some paper say m is at least $\Theta(\frac{1}{\epsilon^2}log(\frac{1}{\delta}))$, can anybody help me to prove that?
Example:
D: [1,2,3,4, ... , 100]
n = 100
M: [2, 70, 90]
m = 3
q = 0.5(this means I want the median of D)
x = 50
y = 70
$\hat{q} = 0.7$
$|\hat{q} - q| = 0.2$