A random vector of continuous random variables and its Order Statistics


I am reading Introduction to Probability by Joe Blitzstein and Jessica Hwang. I was going through the section on order statistics, which I have reproduced below.

Let $X_1, X_2, \cdots, X_n$ be i.i.d. continuous r.v.s with CDF $F$ and PDF $f$. Let $Y_1, Y_2, \cdots, Y_n$ be the order statistics of $X_1, X_2, \cdots, X_n$. We can find the PDF $f_{Y_j}$ of the $j$th order statistic $Y_j$ as follows.

Consider $f_{Y_j}(x) \mathrm{d} x$, which is the probability that $Y_j$ falls into an infinitesimal interval of length $\mathrm{d} x$ around $x$. For this to happen, we need one of the $X_i$ to fall into the infinitesimal interval around $x$, and we need exactly $j - 1$ of the $X_i$ to fall to the left of $x$, leaving the remaining $n - j$ to fall to the right of $x$. We can count this as follows. First, we choose which of the $X_i$ will fall into the infinitesimal interval around $x$. There are $n$ such choices, each of which occurs with probability $f(x) \mathrm{d} x$. Next, we choose exactly $j - 1$ of the remaining $n - 1$ to fall to the left of $x$. There are $\binom{n - 1} {j - 1}$ such choices, each with probability $F(x) ^ {j - 1} (1 - F(x) ) ^ {n - j}$. We therefore have \begin{equation*} f_{Y_j}(x) \mathrm{d}x = n f(x) \mathrm{d} x \binom{n - 1} {j - 1} F(x) ^ {j - 1} (1 - F(x) ) ^ {n - j}, \end{equation*} and hence \begin{equation*} f_{Y_j}(x) = n f(x) \binom{n - 1} {j - 1} F(x) ^ {j - 1} (1 - F(x) ) ^ {n - j}. \end{equation*}
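As a sanity check on the formula $f_{Y_j}(x) = n f(x) \binom{n - 1}{j - 1} F(x)^{j - 1} (1 - F(x))^{n - j}$, here is a minimal simulation sketch. It is not from the book. It assumes the $X_i$ are Uniform(0, 1), so that $F(x) = x$ and $f(x) = 1$, and the function names, the interval width `dx`, and the values of `n`, `j`, `x` are illustrative choices of mine.

```python
import math
import random


def order_stat_pdf_uniform(x, n, j):
    """Formula n * C(n-1, j-1) * F(x)^(j-1) * (1 - F(x))^(n-j) * f(x),
    specialised to Uniform(0, 1), where F(x) = x and f(x) = 1."""
    return n * math.comb(n - 1, j - 1) * x ** (j - 1) * (1 - x) ** (n - j)


def simulate_density(x, n, j, dx=0.02, trials=200_000, seed=0):
    """Estimate the density of Y_j at x as
    P(Y_j lands in an interval of width dx centred at x) / dx."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = sorted(rng.random() for _ in range(n))
        y_j = sample[j - 1]  # j-th smallest value (j is 1-indexed)
        if abs(y_j - x) < dx / 2:
            hits += 1
    return hits / (trials * dx)


# Illustrative (assumed) parameters: 2nd order statistic out of 5, at x = 0.3.
n, j, x = 5, 2, 0.3
formula = order_stat_pdf_uniform(x, n, j)
estimate = simulate_density(x, n, j)
print(f"formula:  {formula:.4f}")
print(f"estimate: {estimate:.4f}")
```

With a small `dx` and enough trials, the two printed values should agree up to Monte Carlo noise.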


I have a question about the above argument. It says that for $Y_j$ to fall into an infinitesimal interval of length $\mathrm{d} x$ around $x$, we need one of the $X_i$ to fall into the infinitesimal interval around $x$, exactly $j - 1$ of the $X_i$ to fall to the left of $x$, and the remaining $n - j$ to fall to the right of $x$.

However, I do not 'clearly' understand why only one of the $X_i$ must fall into the interval of length $\mathrm{d} x$ around $x$, even though $Y_1 \le \cdots \le Y_n$ and equality 'could' hold. My understanding is that since $X_1, X_2, \cdots, X_n$ are i.i.d. continuous r.v.s, the probability of a 'tie' is zero, i.e., $P(X_i = X_k) = 0$ for all $i \neq k$. Now, if two of the $X_i$ fall into an infinitesimal interval of length $\mathrm{d} x$ around $x$, then the probability would be $\left( f(x)\mathrm{d} x \right) ^ 2$, which could be non-zero, and this seems to contradict the fact that the probability of a tie is zero. Is my reasoning correct? If not, what is the reason for saying that exactly one of the $X_i$ must fall into the interval of length $\mathrm{d} x$ around $x$?
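One way to probe this numerically is to compare the probability that exactly one of the $X_i$ lands in the interval with the probability that two or more do, as $\mathrm{d} x$ shrinks. The sketch below is only an illustration under an assumption of mine: the $X_i$ are Uniform(0, 1), so each one lands in an interval of width `dx` with probability `dx`. The function name and the values of `n` and `dx` are hypothetical.

```python
def interval_hit_probs(n, dx):
    """For n i.i.d. Uniform(0, 1) variables and an interval of width dx
    inside (0, 1), return (P(exactly one lands in it), P(two or more do))."""
    p = dx  # chance that a single Uniform(0, 1) variable lands in the interval
    exactly_one = n * p * (1 - p) ** (n - 1)
    none = (1 - p) ** n
    two_or_more = 1 - none - exactly_one
    return exactly_one, two_or_more


n = 5  # illustrative sample size
ratios = []
for dx in (1e-1, 1e-2, 1e-3):
    one, two_plus = interval_hit_probs(n, dx)
    ratios.append(two_plus / one)
    print(f"dx={dx:g}: P(one)={one:.3e}, P(two+)={two_plus:.3e}, "
          f"ratio={two_plus / one:.3e}")
```

In this setting the "two or more" probability is of order $(\mathrm{d} x)^2$ while the "exactly one" probability is of order $\mathrm{d} x$, so their ratio goes to zero as $\mathrm{d} x \to 0$.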