Why should we take the intersection in this order statistics problem?


Let $X_1,\dots,X_n$ be a sample of i.i.d. RVs, $X_j\sim F$. Denote by $X_{(1)}\le X_{(2)}\le\dots\le X_{(n)}$ the order statistics for the sample. Find the DFs of $X_{(1)}$ and $X_{(n)}$.

My take:
$\mathbb{P}(X_{(1)}\le t)=1-\mathbb{P}(X_{(1)}>t)=1-(1-F(t))$. But the solution says $1-(1-F(t))^n$. I suspect they computed as $1-\mathbb{P}(\bigcap_{j=1}^n \{X_{(j)}>t\})$. Why should we take the intersection? Isn't $X_{(1)}\le X_{(2)}\le\dots\le X_{(n)}$, hence if $X_{(1)}>t$ then it follows $X_{(2)},\dots,X_{(n)}>t$?

Similarly, why should we bother taking the intersection: $\mathbb{P}(X_{(n)}\le t)=\mathbb{P}(\bigcap_{j=1}^n \{X_{(j)}\le t\})$? Since $X_{(n)}\le t$ implies $X_{(1)},\dots,X_{(n-1)}\le t$.

I am confused and couldn't find any clue. If anyone could help explain this, it would be greatly appreciated. Thanks!


There are 1 best solutions below

On BEST ANSWER

The minimum of $X_1,\ldots,X_n$ is more than $t$ precisely if all of $X_1,\ldots,X_n$ are more than $t$. To say that you're at a point where all of several events happen is to say that you're at a point in the intersection of the sets where those events happen.

It is true that if $X_{(1)}>t$ then the other order statistics are greater than $t$.

But it is not true that if one of the original observations is greater than $t$, then all of them are greater than $t$. It is those original unsorted observations whose cumulative distribution function you know, and it is also those observations that are independent.
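Spelled out, the computation would run roughly as follows, assuming only what the question states (the $X_j$ are independent with common DF $F$). Note that the intersections are over the unsorted $X_j$, not over the order statistics $X_{(j)}$, which is what lets the probability factor into a product:

\begin{align}
\Pr(X_{(1)}\le t) & = 1-\Pr(X_{(1)}>t) = 1-\Pr\Big(\bigcap_{j=1}^n \{X_j>t\}\Big) = 1-\prod_{j=1}^n \Pr(X_j>t) = 1-(1-F(t))^n, \\
\Pr(X_{(n)}\le t) & = \Pr\Big(\bigcap_{j=1}^n \{X_j\le t\}\Big) = \prod_{j=1}^n \Pr(X_j\le t) = F(t)^n.
\end{align}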

A concrete example: Suppose $n=2$ and $\displaystyle X_1 = \begin{cases} 1 & \text{with probability }1/2, \\ 2 & \text{with probability }1/2. \end{cases}$

Then we have \begin{align} \Pr(X_1=X_2=1) & = 1/4 \\ \Pr(X_1=1\ \&\ X_2=2) & = 1/4 \\ \Pr(X_1=2\ \&\ X_2=1) & = 1/4 \\ \Pr(X_1=X_2=2) & = 1/4 \end{align}

So what are $\Pr(X_{(1)}=1)$ and $\Pr(X_{(1)}=2)$?

$\Pr(X_{(1)}\ge1) = \Pr(\text{both}\ge1)=1$.

$\Pr(X_{(1)}\ge2) = \Pr(\text{both}\ge 2)=1/4$, so $\Pr(X_{(1)}=2)=1/4$ and $\Pr(X_{(1)}=1)=1-1/4=3/4$.

Certainly it is true that if $X_{(1)}\ge\text{something}$, then $X_{(2)}\ge\text{that same thing}$. But it is not true that if $X_1\ge\text{something}$ then $X_2\ge\text{that same thing}$, so if you want the probability that they're both $\ge$ something, you need to consider both.
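If it helps, the two-point example can be checked by brute force. The sketch below is only an illustration: the function names (`F`, `min_cdf_by_enumeration`, `max_cdf_by_enumeration`) are made up for this answer. It enumerates every possible sample, weights each by the product of the individual probabilities (this is where independence is assumed), and compares the result with $1-(1-F(t))^n$ and $F(t)^n$.

```python
from fractions import Fraction
from itertools import product

# Distribution of one observation in the example: 1 or 2, each with probability 1/2.
pmf = {1: Fraction(1, 2), 2: Fraction(1, 2)}


def F(t):
    """CDF of a single observation, P(X_j <= t)."""
    return sum((p for x, p in pmf.items() if x <= t), Fraction(0))


def _outcome_probability(outcome):
    """Probability of one unsorted sample; the product uses independence."""
    prob = Fraction(1)
    for x in outcome:
        prob *= pmf[x]
    return prob


def min_cdf_by_enumeration(t, n):
    """P(X_(1) <= t), summed over all n-tuples of unsorted observations."""
    return sum(
        (_outcome_probability(o) for o in product(pmf, repeat=n) if min(o) <= t),
        Fraction(0),
    )


def max_cdf_by_enumeration(t, n):
    """P(X_(n) <= t), summed over all n-tuples of unsorted observations."""
    return sum(
        (_outcome_probability(o) for o in product(pmf, repeat=n) if max(o) <= t),
        Fraction(0),
    )


if __name__ == "__main__":
    n = 2
    for t in (1, 2):
        # Enumeration and closed form should agree for every t.
        print(t, min_cdf_by_enumeration(t, n), 1 - (1 - F(t)) ** n)
        print(t, max_cdf_by_enumeration(t, n), F(t) ** n)
```

Using `Fraction` keeps the arithmetic exact, so the enumerated values can be compared with the closed forms by equality rather than up to rounding.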