My book says the following:
"Let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ be a set of values ordered in ascending order ($X_{(1)} \leq X_{(2)} \leq \ldots \leq X_{(n)}$). For a given $p$ ($0 \le p \le 1$), the $p$th sample quantile $q_p$ is a value that has a proportion $p$ of the sample taking values smaller than it and a proportion $1-p$ taking values larger than it."
It then says that the value of this quantile is $X_{(1+(n-1)p)}$.
My question: why is the $p$th quantile at index $1+(n-1)p$ of the ordered sample? Is there an intuitive way to understand this?
It is often difficult to explain rationales for formulas involving quantiles.
There are many slightly different ways to define a quantile. For small samples of integer data, the variations in results from one definition to another can be very noticeable.
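One way to read the index $1+(n-1)p$ from the question, offered as a sketch of one common convention rather than as the book's own derivation: the $n$ ordered values are separated by $n-1$ gaps. Placing the quantile a fraction $p$ of the way along those gaps, starting from position 1, gives

$$h = 1 + (n-1)p, \qquad h = 1 \text{ at } p = 0, \qquad h = n \text{ at } p = 1.$$

When $h$ is not an integer, the convention assumed here is to interpolate linearly between the neighbouring order statistics:

$$q_p = X_{(\lfloor h \rfloor)} + (h - \lfloor h \rfloor)\,\bigl(X_{(\lfloor h \rfloor + 1)} - X_{(\lfloor h \rfloor)}\bigr).$$

For example, with $n = 20$ and $p = 0.25$, $h = 1 + 19(0.25) = 5.75$, so $q_{0.25} = X_{(5)} + 0.75\,(X_{(6)} - X_{(5)})$.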
Different textbooks and software use different definitions. R statistical software gives rigorous accounts of nine different definitions on the help screen that is accessible from the Console by typing `? quantile`.
Here is a brief demonstration that shows a few of the differences. The vector `x` is a sorted sample of size 20 from $\mathrm{Binom}(50, 1/3)$.
The statement `quantile(x)` without further arguments gives percentiles 0, 25, 50, 75, and 100, based on the 7th (default) of these definitions. Other definitions can be invoked with arguments such as `type=1`, and so on.
Exact discrepancies among definitions depend on the particular random sample generated.
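To make the difference between two of these definitions concrete, here is a minimal sketch in plain Python (not R). It assumes that the 7th definition interpolates linearly at the 1-based position $1+(n-1)p$ and that the 1st is the inverse of the empirical CDF, which is my reading of R's help page. The function names and the small sample are made up for illustration and are not the Binomial sample described above.

```python
import math

def quantile_type7(sorted_x, p):
    """Linear interpolation at the 1-based position h = 1 + (n - 1) * p."""
    n = len(sorted_x)
    h = 1 + (n - 1) * p            # fractional 1-based index
    lo = math.floor(h)             # order statistic just below (or at) h
    hi = min(lo + 1, n)            # next order statistic, capped at n
    frac = h - lo                  # how far h sits between lo and hi
    return sorted_x[lo - 1] + frac * (sorted_x[hi - 1] - sorted_x[lo - 1])

def quantile_type1(sorted_x, p):
    """Inverse empirical CDF: the smallest X_(k) with k / n >= p."""
    n = len(sorted_x)
    # Note: no floating-point fuzz handling here, so n * p should be exact.
    k = max(1, math.ceil(n * p))
    return sorted_x[k - 1]

# Hypothetical sorted sample of size 8 (illustrative only).
x = [11, 13, 14, 16, 17, 17, 19, 22]

for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, quantile_type7(x, p), quantile_type1(x, p))
```

For this sample the two definitions agree at $p = 0$ and $p = 1$ (11 and 22) but differ in between: 13.75 versus 13 at $p = 0.25$, 16.5 versus 16 at $p = 0.5$, and 17.5 versus 17 at $p = 0.75$. The interpolating definition can return values that are not in the sample, while the inverse-CDF definition always returns an observed value.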