Tail probability of sum of order statistics of distance from point to a set

Let $P$ be a distribution on a metric space $(\mathcal X, d)$. For a point $x \in \mathcal X$ and a Borel set $B \subseteq \mathcal X$, let $d(x,B) := \inf_{y \in B}d(x,y)$ be the distance from $x$ to $B$. Finally, let $x_1,\ldots,x_N$ be an iid sample from $P$, let $d(x_{(k)}, B)$ denote the $k$th smallest element of the set $\{d(x_1,B),d(x_2,B),\ldots,d(x_N,B)\}$, and for $M \le N$ define $$S_M := \dfrac{1}{N}\sum_{i=1}^M d(x_{(i)},B). $$
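To make the definition concrete, here is a minimal Python sketch. It assumes $\mathcal X = \mathbb R$ with $d(x,y) = |x-y|$ and a finite set $B$; both are illustrative choices, not part of the question, and the helper names `dist_to_set` and `S` are hypothetical.

```python
import random

def dist_to_set(x, B):
    """d(x, B) = min over y in B of |x - y| (assumes B is a finite subset of R)."""
    return min(abs(x - y) for y in B)

def S(sample, B, M):
    """S_M = (1/N) * (sum of the M smallest distances d(x_i, B)), N = len(sample)."""
    N = len(sample)
    dists = sorted(dist_to_set(x, B) for x in sample)  # order statistics
    return sum(dists[:M]) / N

# Example: N = 10 uniform draws on [0, 1], B = {0}, M = 3.
rng = random.Random(0)
sample = [rng.random() for _ in range(10)]
print(S(sample, B=[0.0], M=3))
```

Note that the normalisation is by $N$, not $M$, exactly as in the definition of $S_M$.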

By this post, if $t \mapsto F(t) := P(d(x_1,B) \le t)$ is the common distribution function of the $d(x_i,B)$'s, then the distribution function of $d(x_{(k)},B)$ is given by $$F_k(t) := P(d(x_{(k)}, B) \le t) = \sum_{i=k}^{N} {N \choose i} F(t)^i (1-F(t))^{N-i}, \quad\text{so that}\quad 1-F_k(t) = \sum_{i=0}^{k-1} {N \choose i} F(t)^i (1-F(t))^{N-i}, $$ and, since the distances are nonnegative, the expected value of $S_M$ is

$$ \mathbb E_P [S_M] = \frac{1}{N}\sum_{k=1}^M\int_0^\infty (1-F_k(t))dt = \dfrac{1}{N}\int_{0}^\infty \sum_{i=0}^{M-1} (M-i){N \choose i}F(t)^i (1-F(t))^{N-i} dt $$
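As a sanity check, the identity $\mathbb E_P[S_M] = \frac{1}{N}\int_0^\infty \sum_{i=0}^{M-1}(M-i){N \choose i}F(t)^i(1-F(t))^{N-i}dt$ can be tested numerically in a toy case. The sketch below assumes $P$ is uniform on $[0,1]$ and $B = \{0\}$, so that $d(x,B) = x$, $F(t) = t$ on $[0,1]$, and $\mathbb E[d(x_{(k)},B)] = k/(N+1)$, giving $\mathbb E_P[S_M] = M(M+1)/(2N(N+1))$. The function names are hypothetical and the setting is chosen only because it has a closed form.

```python
import math
import random

def expected_S_formula(N, M, F, t_max, steps=20000):
    """Midpoint-rule approximation of
    (1/N) * int_0^t_max sum_{i=0}^{M-1} (M-i) C(N,i) F(t)^i (1-F(t))^(N-i) dt.
    Assumes F(t) = 1 for t >= t_max, so the integrand vanishes beyond t_max."""
    h = t_max / steps
    total = 0.0
    for j in range(steps):
        p = F((j + 0.5) * h)
        total += sum((M - i) * math.comb(N, i) * p**i * (1 - p)**(N - i)
                     for i in range(M))
    return total * h / N

def expected_S_monte_carlo(N, M, trials=20000, seed=0):
    """Monte Carlo estimate of E[S_M] for uniform [0,1] samples and B = {0}."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        d = sorted(rng.random() for _ in range(N))  # distances to 0, sorted
        acc += sum(d[:M]) / N
    return acc / trials

N, M = 10, 4
closed_form = M * (M + 1) / (2 * N * (N + 1))  # uses E[U_(k)] = k / (N + 1)
print(closed_form, expected_S_formula(N, M, lambda t: t, 1.0),
      expected_S_monte_carlo(N, M))
```

The three numbers should agree up to quadrature and sampling error. This only checks the expectation; it says nothing about the tail $P(S_M > t)$.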

Question

What can be said about the tail probabilities $P(S_M > t)$?

N.B.: I'm fine with assuming that $P$ satisfies a transportation-cost inequality for the Wasserstein distances $W_1$ or $W_2$.