Where does (n+1) come from when finding quantiles


Many textbooks give a formula like the following for finding quantiles, where $q$ is the quantile, $p$ is a number between 0 and 1, and $n$ is the total number of samples:

$q(p) = x_{(k)} + \alpha(x_{(k+1)} - x_{(k)})$,

where $k = [p(n+1)]$ and $\alpha = p(n+1) - [p(n+1)]$.

I don't understand the reasoning behind the $(n+1)$ in $k = [p(n+1)]$, and consequently in the other expressions.

For the median, $n+1$ makes sense: the median index $m$ satisfies $m-1=n-m$, so $m = \frac{n+1}2$. But shouldn't this change for a general $p$, now that we're effectively scaling one side?

Best answer

If I understand the notation correctly, the idea is that the $n$ samples, sorted as $x_1 \le x_2 \le \dots \le x_n$, partition all possible values of $X$ into $(n+1)$ ranges that are treated as equally probable:

  • range $0$: $X < x_1$;
  • range $1$: $x_1 < X < x_2$;
  • range $2$: $x_2 < X < x_3$; $\ldots$
  • range $(n-1)$: $x_{n-1} < X < x_n$;
  • range $n$: $x_n < X$,

and some boundary handling when $X \in \{x_i\}$. That's where the $(n+1)$ originates from.
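One way to make "equally probable" precise, under the assumption (not stated in the question) that the samples are i.i.d. draws from a continuous distribution with CDF $F$ and are sorted in increasing order: $F(x_k)$ then follows a $\mathrm{Beta}(k,\, n-k+1)$ distribution, so

$$\mathbb{E}\big[F(x_k)\big] = \frac{k}{n+1}, \qquad k = 1, \dots, n.$$

On average, then, a fraction $\frac{k}{n+1}$ of the probability mass lies below $x_k$, and each of the $(n+1)$ ranges carries expected probability $\frac{1}{n+1}$.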

If $p(n+1)$ is an integer and $1 \le p(n+1) \le n$, the $p$-quantile is the start of range $k = p(n+1)$. So for example, when $p = \frac 2{n+1}$, the $\frac 2{n+1}$-quantile is the start of range $2$, i.e. $x_2$, and two of the $(n+1)$ ranges are before $x_2$.

If $p(n+1)$ is not an integer but still $1 < p(n+1) < n$, the formula for $q(p)$ works inside range $k = \lfloor p(n+1)\rfloor$ and linearly interpolates between its two ends, $x_{\lfloor p(n+1)\rfloor}$ and $x_{\lceil p(n+1)\rceil}$.
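The two cases above can be sketched in a few lines of Python. This is only an illustration of the formula from the question, reading $[\cdot]$ as the floor function; the function name `quantile_n_plus_1` and the choice to raise an error when $p(n+1)$ falls outside $[1, n]$ are my own assumptions, not part of the textbook formula.

```python
import math


def quantile_n_plus_1(samples, p):
    """Quantile using k = floor(p(n+1)) and alpha = p(n+1) - k.

    Hypothetical helper for illustration; out-of-range p is rejected
    rather than clamped, which is an assumption of this sketch.
    """
    xs = sorted(samples)          # order statistics x_(1) <= ... <= x_(n)
    n = len(xs)
    h = p * (n + 1)               # position on the (n+1)-range scale
    if not 1 <= h <= n:
        raise ValueError("need 1 <= p*(n+1) <= n")
    k = math.floor(h)             # index of the range (1-based)
    alpha = h - k                 # fractional position inside range k
    if k == n:                    # h == n exactly, so alpha == 0
        return xs[n - 1]
    # xs is 0-based, so x_(k) is xs[k - 1] and x_(k+1) is xs[k]
    return xs[k - 1] + alpha * (xs[k] - xs[k - 1])
```

For example, `quantile_n_plus_1([10, 20, 30, 40], 0.5)` gives `25.0`: here $p(n+1) = 2.5$, so $k = 2$ and $\alpha = 0.5$, halfway between $x_2 = 20$ and $x_3 = 30$. With an odd $n$ and $p = \frac12$, $p(n+1) = \frac{n+1}2$ is an integer and the result is the middle sample, which matches the median argument in the question.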