Claim about Empirical Quantile Function: $\hat{F}_n^{-1}\left(\frac{i}{n+1}\right) = X_{(i)}$


The lecturer in a class I'm taking defined the empirical quantile function for a sample of $n$ random variables $\{X_i\}_{i = 1}^n$ as follows: $$ \hat{F}_n^{-1}(p) = \left\{\begin{aligned} &X_{(np)} &&, np \in \mathbb{N}\\ &X_{(\lfloor np+1 \rfloor)} &&, np \notin \mathbb{N}, \end{aligned} \right.$$

where $X_{(i)}$ represents the $i^{\text{th}}$ order statistic of the sample.

Based on this definition, I'm trying to understand the following claim: $$ \hat{F}_n^{-1}\left(\frac{i}{n+1}\right) = X_{(i)}. $$


My progress to this point:

  • I was able to show that $n \in \mathbb{N}$ and $i \in [1,n] \cap \mathbb{N}$ together imply $n+1 \nmid ni$. Hence, $\frac{ni}{n+1} \notin \mathbb{N}$ in this situation.
  • What then remains to show is that $i \leq \frac{ni}{n+1} + 1 < i+1$. The second inequality is clear, since $\frac{n}{n+1} < 1$.
  • Thus, the claim boils down to showing $i \leq \frac{n}{n+1}\cdot i + 1$ when $n \in \mathbb{N}$, $i \in [1,n]$.
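
As a quick sanity check of these three bullets, take $n = 10$ and $i = 5$ (values chosen here purely for illustration): $$ np = \frac{10 \cdot 5}{11} = \frac{50}{11} \approx 4.545 \notin \mathbb{N}, \qquad \lfloor np + 1 \rfloor = \lfloor 5.545\ldots \rfloor = 5, $$ so the definition gives $\hat{F}_{10}^{-1}\left(\frac{5}{11}\right) = X_{(5)}$, as claimed.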

There are 2 best solutions below

On BEST ANSWER

To prove the third bullet point, suppose for the sake of contradiction that for some $k \in \{1,\dots,n\}$:

\begin{equation} \begin{split} k & > \frac{nk}{n+1} + 1\\ \frac{n+1}{n+1}k & > \frac{nk}{n+1} + \frac{n+1}{n+1}\\ nk + k & > nk + n + 1\\ k & > n + 1 \end{split} \end{equation}

But $k > n + 1$ contradicts the assumption $k \leq n$, so the inequality in the third bullet point holds.
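
The inequality can also be verified directly. A short sketch, using only the assumption $1 \leq k \leq n$: $$ k - \frac{nk}{n+1} = \frac{k}{n+1}, \qquad 0 < \frac{k}{n+1} \leq \frac{n}{n+1} < 1. $$ Hence $k - 1 < \frac{nk}{n+1} < k$, so $\frac{nk}{n+1}$ is not an integer and $\left\lfloor \frac{nk}{n+1} + 1 \right\rfloor = k$, which covers all three bullet points at once.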


Maybe an example in R will help you visualize the ECDF.

Here is a sorted normal sample rounded to two decimal places:

x = sort( round(rnorm(10, 20, 3),2) );  x
[1] 16.38 18.07 18.23 18.37 19.31 20.29 20.55 20.68 22.20 26.29

The fifth order statistic is 19.31:

x[5]
[1] 19.31

The x-values of the ECDF are the order statistics. Its y-values are shown below:

F = (1:10)/11;  F
 [1] 0.09090909 0.18181818 0.27272727 0.36363636 0.45454545 0.54545455
 [7] 0.63636364 0.72727273 0.81818182 0.90909091

The fifth y-value of the ECDF is $5/11.$

F[5]; 5/11
[1] 0.4545455
[1] 0.4545455
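
To tie this example back to the claim in the question, the lecturer's two-case definition can be sketched as a small R function and evaluated at $p = i/(n+1)$. This is only an illustrative sketch: the helper name `emp_quantile` and the tolerance used to decide whether $np$ is an integer are assumptions made here, not part of the original definition or of base R.

```r
# Sorted sample copied from the output shown above
x <- c(16.38, 18.07, 18.23, 18.37, 19.31, 20.29, 20.55, 20.68, 22.20, 26.29)
n <- length(x)

# Hypothetical helper (the name is an assumption): empirical quantile
# function following the lecturer's two-case definition.
emp_quantile <- function(p, xs) {
  xs <- sort(xs)
  np <- length(xs) * p
  k  <- round(np)
  # Treat np as a natural number only if it is within a small tolerance
  # of a positive integer (the tolerance 1e-9 is an assumption).
  if (abs(np - k) < 1e-9 && k >= 1) {
    xs[k]               # case np in N:     X_(np)
  } else {
    xs[floor(np + 1)]   # case np not in N: X_(floor(np + 1))
  }
}

# Evaluate at p = i/(n+1) for i = 1, ..., n and compare with the
# order statistics themselves.
q <- sapply((1:n) / (n + 1), emp_quantile, xs = x)
all(q == x)
```

For this sample the comparison should return `TRUE`, matching the identity $\hat{F}_n^{-1}\left(\frac{i}{n+1}\right) = X_{(i)}$ for every $i$ from 1 to 10.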

Here is a plot of this ECDF, in which the x- and y-values mentioned above are emphasized:

plot(x, F, type="s", lwd=2, ylim=c(0,1))
  abline(h=0:1, col="green2")
  points(x, F, pch=19)
  abline(v = x[5], col="red", lty="dotted")
  abline(h = F[5], col="blue", lty="dotted")


Some authors say that the ECDF consists only of the heavy dots, and some say that the horizontal lines are also part of the function. (If included in a plot, the vertical lines at 'jump points' or 'knots' are just to help the eye follow the function; sometimes they are drawn as dotted lines.)

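You should know that different texts and statistical software programs (R among them) define the ECDF in slightly different ways, and the differences matter when comparing results. Here is a plot of the ECDF from R, whose `ecdf` function takes the value $i/n$ at the $i$th order statistic (when there are no ties) rather than the $i/(n+1)$ used above.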

plot(ecdf(x))
