Quantile of Empirical CDF and Tail Bound

157 Views Asked by At

Let $F$ be the CDF of $X$ and $p \in (0, 1)$, and $F_n$ be the empirical CDF of $X_1, ..., X_n$; $F_n(x) = \frac{1}{n}\sum_{i = 1}^nI(X_i \le x)$. The $p$ th quantile of $F$ and $F_n$ are defined as follows; $$\zeta_p := \text{inf}\{x \in \mathbb R : F(x) \ge p\} \quad\text{and}\quad \hat\zeta_p := \{x \in \mathbb R : F_n(x) \ge p\}$$ Then, prove the following inequality (for all $t > 0$); $$\mathbb P(|\hat\zeta_p -\zeta_p| > t) \le 2\text{exp}\{-2n\epsilon_t^2\},$$ where $\epsilon_t = \text{min}(F(\zeta_p + t) − p, p − F(\zeta_p − t))$.

This is what I'm trying to solve.

We have an inequality regarding the empirical CDF and Hoeffding's bound which is similar to the desired result; $$\mathbb P(|F_n(x) - F(x)| > t) \le 2\text{exp}\{-2nt^2\}\quad \forall t > 0$$ But, I'm not sure whether I can apply this inequality to the desired result, and (if possible) how to do so.

What makes me confusing is that $F(\hat\zeta_p -\zeta_p)$ doesn't seem to make some useful result.

Also, I found it difficult to find the relationship between the definition of $\epsilon_t$ and the desired inequality.

Any hint or approach with respect to this problem would be grateful. Thank you.

1

There are 1 best solutions below

0
On BEST ANSWER

It is easy to verify that $$\mathbb{P}(F_n(x)-F(x) < -t) \leq \exp(-2nt^2) \; \forall t>0$$ This is basically the one-sided version of the inequality you mentioned in your question (the factor of $2$ comes from the union bound)

Let's work with the probability $\mathbb{P}(\hat \zeta_p > \zeta_p+t)$. Notice that $\hat \zeta_p > \zeta_p + t$ is equivalent to $F_n(\zeta_p+t) < p$. With that in mind, we can write $$\mathbb{P}\big\{\hat \zeta_n > \zeta_n+t\big\} = \mathbb{P}\big\{F_n(\zeta_p+t) < p\big\}=\mathbb{P}\big\{F_n(\zeta_p+t) - F(\zeta_p+t) < -(F(\zeta_n+t)-p)\big\} \leq e^{-2n(F(\zeta_p +t)-p)^2}$$

Repeating the same arguments for the probability $\mathbb{P}\big\{\hat \zeta_n < \zeta_p-t\big\}$ and applying the union bound we get the required inequality.