Sample quantile of order p


My book says the following:

"Let $X_{(1)}, X_{(2)}, \dots, X_{(n)}$ be a set of values ordered in ascending order ($X_{(1)} \leq X_{(2)} \leq \dots \leq X_{(n)}$). For a given $p$ ($0 \le p \le 1$), the $p$th sample quantile $q_p$ is a value that has a proportion $p$ of the sample taking values smaller than it and a proportion $1-p$ taking values larger than it."

It then says that the value of this quantile is $X_{(1+(n-1)p)}$.

My question: why is the $p$th quantile at index $1+(n-1)p$ of the ordered sample? Is there an intuitive way to understand this?


There are 1 best solutions below


It is often difficult to explain rationales for formulas involving quantiles.

There are many slightly different ways to define quantile. For small samples of integer data, the variations in results from one definition to another can be very noticeable.

Different textbooks and software use different definitions. The R documentation gives rigorous accounts of nine different definitions on the help page for quantile, which you can open from the Console by typing ?quantile. Here is a brief demonstration that shows a few of the differences.

The vector x is a sorted sample of size 20 from $\mathrm{Binom}(50, 1/3)$.

x
## 12 13 13 13 14 15 15 15 15 16 16 16 16 16 17 17 18 19 19 21
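
To connect this sample with the index $1+(n-1)p$ from the question, here is a minimal sketch of that rule computed by hand. The index maps $p=0$ to position 1 and $p=1$ to position $n$, spacing the $n$ order statistics evenly over $[0,1]$; when the index is not an integer, the sketch assumes linear interpolation between the two neighbouring order statistics, which is how R's type 7 treats it. The helper name quantile_by_index is hypothetical, not part of R.

```r
# The sorted sample shown above.
x <- c(12, 13, 13, 13, 14, 15, 15, 15, 15, 16,
       16, 16, 16, 16, 17, 17, 18, 19, 19, 21)

# Hypothetical helper (not an R built-in): quantile at the
# fractional index h = 1 + (n - 1) * p of the ordered sample.
quantile_by_index <- function(x, p) {
  x <- sort(x)
  n <- length(x)
  h <- 1 + (n - 1) * p   # p = 0 gives index 1, p = 1 gives index n
  lo <- floor(h)         # order statistic just below the index
  hi <- ceiling(h)       # order statistic just above the index
  # Assumption: interpolate linearly when h is not an integer.
  x[lo] + (h - lo) * (x[hi] - x[lo])
}

p <- c(0, 0.25, 0.5, 0.75, 1)
by_hand  <- quantile_by_index(x, p)
built_in <- quantile(x, probs = p, type = 7, names = FALSE)

# The two should agree up to floating-point error.
print(all.equal(by_hand, built_in))
```

For example, with $n=20$ and $p=0.25$ the index is $1+19(0.25)=5.75$, so the value sits three quarters of the way from the 5th to the 6th ordered observation.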

The statement quantile(x) without further arguments gives the 0th, 25th, 50th, 75th, and 100th percentiles, based on the 7th (default) of these definitions. Other definitions can be invoked with the type argument, such as type=1, and so on.

quantile(x, type=1)
  0%  25%  50%  75% 100% 
  12   14   16   17   21 
quantile(x, type=2)
  0%  25%  50%  75% 100% 
12.0 14.5 16.0 17.0 21.0 
quantile(x, type=3)
  0%  25%  50%  75% 100% 
  12   14   16   17   21 
quantile(x, type=4)
  0%  25%  50%  75% 100% 
  12   14   16   17   21 
quantile(x, type=5)
  0%  25%  50%  75% 100% 
12.0 14.5 16.0 17.0 21.0 

And so on. Exact discrepancies among definitions depend on the particular random sample generated.
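
To see all nine definitions side by side rather than one call at a time, something like the following sketch could be used. It only relies on quantile() and its type argument; the object name comparison and the column labels are just illustrative choices.

```r
# The same sorted sample as in the demonstration above.
x <- c(12, 13, 13, 13, 14, 15, 15, 15, 15, 16,
       16, 16, 16, 16, 17, 17, 18, 19, 19, 21)

# One column per definition (types 1 to 9), one row per percentile.
comparison <- sapply(1:9, function(t) quantile(x, type = t))
colnames(comparison) <- paste0("type", 1:9)

print(comparison)
```

Reading across a row of the resulting table shows where the definitions disagree for this particular sample; a different sample would generally give a different pattern of discrepancies.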