Finding the correlation coefficient of order statistics


I am working on the following problem.

Let $X_{(1)}, \ldots ,X_{(n)}$ be the order statistics of a sample from the uniform distribution on $[0,1]$. Find the correlation coefficient of $X_{(1)}$ and $X_{(n)}$.

So, here are the things that I understand.

1) I know how to find the pdf of the $k$th order statistic; it is $n\binom{n-1}{k-1}f(x)[F(x)]^{k-1}[1-F(x)]^{n-k}$.

2) $X_{(1)}$ and $X_{(n)}$ are simply the minimum and the maximum of the $X_i$.

3) What we are solving for is $$\frac{\operatorname{Cov}[X_{(1)},X_{(n)}]}{\sigma_{X_{(1)}}\sigma_{X_{(n)}}}$$ and I am familiar with how to find each of these quantities.

So here is my question.

I encountered this problem on a practice actuarial exam. Is this a typical question? It seems unusually time-consuming.

Trying to answer this question, I looked at the solution, and it somehow used the relationship $E[X_{(1)}]=1-E[X_{(n)}]$. This does seem to save a lot of time, but I did not see why it is true or what the motivation was.

Can someone help me out?


There are 2 best solutions below

On BEST ANSWER

The identity $E[X_{(1)}]=1-E[X_{(n)}]$ can be upgraded to the fact that $X_{(1)}$ and $1-X_{(n)}$ are identically distributed since defining $Y_k=1-X_k$ yields a sample $(Y_k)_k$ distributed as $(X_k)_k$ and such that $Y_{(1)}=1-X_{(n)}$ and $Y_{(n)}=1-X_{(1)}$.
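The symmetry argument above can be sanity-checked numerically. The following is a minimal simulation sketch, not part of the derivation: the helper names (`simulate_min_max`, `mean`, `correlation`), the sample size `n = 5`, and the number of trials are illustrative choices only. It compares the empirical mean of the minimum with one minus the empirical mean of the maximum, and also reports the empirical correlation of the two.

```python
import random


def simulate_min_max(n, trials, seed=0):
    """Draw `trials` samples of size n from Uniform(0,1); return (mins, maxs)."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    mins, maxs = [], []
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        mins.append(min(sample))
        maxs.append(max(sample))
    return mins, maxs


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = mean([(u - ma) * (v - mb) for u, v in zip(a, b)])
    var_a = mean([(u - ma) ** 2 for u in a])
    var_b = mean([(v - mb) ** 2 for v in b])
    return cov / (var_a * var_b) ** 0.5


n = 5  # illustrative sample size
mins, maxs = simulate_min_max(n, trials=100_000)
print("E[min]     ~", mean(mins))
print("1 - E[max] ~", 1 - mean(maxs))
print("Corr       ~", correlation(mins, maxs))
```

If the symmetry holds, the first two printed numbers should agree up to Monte Carlo noise.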

To compute expectations of functions of $(X_{(1)},X_{(n)})$, the easiest approach might be to note that, for every $0\lt x\lt y\lt1$,

$$ P(x\lt X_{(1)},X_{(n)}\lt y)=P(\forall k\leqslant n,x\lt X_k\lt y)=(y-x)^n, $$

hence, differentiating this identity twice, one sees that $(X_{(1)},X_{(n)})$ has density

$$ f_n(x,y)=n(n-1)(y-x)^{n-2}\,\mathbf 1_{0\lt x\lt y\lt1}. $$

Furthermore, the fact that every $f_n$ is a density yields, for every $i$,

$$ E((X_{(n)}-X_{(1)})^i)=\iint(y-x)^if_n(x,y)\mathrm dx\mathrm dy=\frac{n(n-1)}{(n+i)(n+i-1)}\iint f_{n+i}, $$

hence

$$ E((X_{(n)}-X_{(1)})^i)=\frac{n(n-1)}{(n+i)(n+i-1)}. $$

This, together with the identity in distribution of $X_{(n)}$ and $1-X_{(1)}$, allows one to simplify the computations. For example, the system of two equations

$$ E(X_{(n)})=1-E(X_{(1)}),\qquad E(X_{(n)}-X_{(1)})=\frac{n-1}{n+1}, $$

yields

$$ E(X_{(1)})=\frac1{n+1},\qquad E(X_{(n)})=\frac{n}{n+1}. $$

Likewise,

$$ E((X_{(n)}-X_{(1)})^2)=\frac{n(n-1)}{(n+2)(n+1)}=E(X_{(n)}^2)-2E(X_{(n)}X_{(1)})+E(X_{(1)}^2), $$

hence, computing the densities $u_n$ and $v_n$ of $X_{(1)}$ and $X_{(n)}$ as

$$ u_n(x)=n(1-x)^{n-1}\mathbf 1_{0\lt x\lt1},\qquad v_n(y)=ny^{n-1}\mathbf 1_{0\lt y\lt1}, $$

and using them to compute

$$ E(X_{(n)}^i)=\frac{n}{n+i}, $$

one can deduce $E(X_{(1)}^2)=E((1-X_{(n)})^2)$ and $E(X_{(1)}X_{(n)})$, from which all the variances and covariances follow.
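Carrying these steps through (a sketch, so the algebra is worth double-checking) should give

$$ E(X_{(1)}^2)=1-\frac{2n}{n+1}+\frac{n}{n+2}=\frac{2}{(n+1)(n+2)},\qquad E(X_{(1)}X_{(n)})=\frac12\left(\frac{n}{n+2}+\frac{2}{(n+1)(n+2)}-\frac{n(n-1)}{(n+1)(n+2)}\right)=\frac1{n+2}, $$

so that

$$ \operatorname{Var}(X_{(1)})=\operatorname{Var}(X_{(n)})=\frac{n}{n+2}-\frac{n^2}{(n+1)^2}=\frac{n}{(n+1)^2(n+2)},\qquad \operatorname{Cov}(X_{(1)},X_{(n)})=\frac1{n+2}-\frac{n}{(n+1)^2}=\frac1{(n+1)^2(n+2)}, $$

and the correlation coefficient would then be

$$ \frac{\operatorname{Cov}(X_{(1)},X_{(n)})}{\sqrt{\operatorname{Var}(X_{(1)})\operatorname{Var}(X_{(n)})}}=\frac1n. $$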

On

Are you allowed to use a computer during the exam? If so, it only takes about 20 seconds to solve. In particular, you are given a random variable $X \sim \text{Uniform}(0,1)$ with pdf $f(x)$:


[image; source: tri.org.au]

Then, the joint pdf of the sample minimum $X_1$ and the sample maximum $X_n$, say $g(x_1,x_n)$, is:


[image; source: tri.org.au]

where I am using the OrderStat function in the mathStatica add-on to Mathematica to automate the nitty-gritties. Then, $Corr(X_1, X_n)$ is simply:


[image; source: tri.org.au]

All done.
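For readers without mathStatica, a similar exact computation can be sketched in plain Python with rational arithmetic. This is not the `OrderStat` workflow above. It assumes the standard joint density $n(n-1)(y-x)^{n-2}$ of the minimum and maximum on $0<x<y<1$, and a closed form for the mixed moments obtained by integrating against that density (a Beta integral); the function names `mixed_moment` and `min_max_correlation` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def mixed_moment(n, a, b):
    """Exact E[min^a * max^b] for n >= 2 i.i.d. Uniform(0,1) draws.

    Assumed closed form: n! a! / ((n + a - 1)! (n + a + b)), obtained by
    integrating x^a y^b n(n-1)(y-x)^(n-2) over 0 < x < y < 1.
    """
    return Fraction(factorial(n) * factorial(a),
                    factorial(n + a - 1) * (n + a + b))


def min_max_correlation(n):
    """Exact correlation of the sample minimum and maximum."""
    e_min = mixed_moment(n, 1, 0)
    e_max = mixed_moment(n, 0, 1)
    var_min = mixed_moment(n, 2, 0) - e_min ** 2
    var_max = mixed_moment(n, 0, 2) - e_max ** 2
    cov = mixed_moment(n, 1, 1) - e_min * e_max
    # By symmetry the two variances coincide, so sqrt(var_min * var_max)
    # equals var_min and the result stays an exact fraction.
    assert var_min == var_max
    return cov / var_min


for n in range(2, 7):
    print(n, min_max_correlation(n))
```

Working with `Fraction` keeps every intermediate value exact, so the printed correlations can be compared directly against a hand computation.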

Notes

  1. As disclosure, I should perhaps add that I am one of the authors of the software used above.