The probability involving the ordered normal sample.


Let $X_1,X_2,X_3,X_4,X_5$ be a normal sample, taken from the distribution with unknown mean $\mu$ and known variance $\sigma^2$. Calculate $$P(|X_{3:5}-\mu|<0.841\cdot\sigma).$$

Comment. If the order statistic $X_{3:5}$ were replaced with the sample mean, this would be a standard exercise. Now it seems that some more information about order statistics is necessary, but I have no idea what is needed here. Could you give some advice?


There are 2 best solutions below

1
On BEST ANSWER

The general formula for the distribution of order statistics from an iid sample is:

$$F_{X_{(k)}}(x)=\sum_{i=k}^n{n \choose i}\left[F(x)\right]^i\left[1-F(x)\right]^{n-i}$$

Substituting $k=3,n=5,z=\frac{x-\mu}{\sigma}\;\text{ and }F(z)=\Phi\left(z\right)$ we get:

$$F_{Z_{(3)}}(z)=\sum_{i=3}^5{5 \choose i}\left[\Phi\left(z\right)\right]^i\left[1-\Phi\left(z\right)\right]^{5-i}$$

where $Z_{(k)}$ is the $k$-th order statistic of a sample of standard normal random variables.
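For $n=5$ and $k=3$ the sum has only three terms, so it can be expanded into a polynomial that is easy to evaluate by hand. Writing $p=\Phi(z)$ (a shorthand introduced here, not part of the answer's notation):

$$F_{Z_{(3)}}(z)=10p^3(1-p)^2+5p^4(1-p)+p^5=10p^3-15p^4+6p^5$$

As a quick sanity check, $p=\tfrac12$ gives $\tfrac{10}{8}-\tfrac{15}{16}+\tfrac{6}{32}=\tfrac12$, as it should for the median of a symmetric distribution.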

Now, let's rearrange your original probability statement:

$$P(|X_{(3)}-\mu|<0.841\cdot\sigma)=P\left(\frac{|X_{(3)}-\mu|}{\sigma}<0.841\right)=P(|Z_{(3)}|<0.841)$$

So, your problem is equivalent to calculating the value of $F_{Z_{(3)}}(0.841)-F_{Z_{(3)}}(-0.841)$. Using $\Phi(0.841)\approx 0.8$ and $\Phi(-0.841)\approx 0.2$:

$$F_{Z_{(3)}}(0.841)\approx \sum_{i=3}^5 {5\choose i}\left[0.8\right]^i\left[0.2\right]^{5-i}=0.94208$$ $$F_{Z_{(3)}}(-0.841) \approx \sum_{i=3}^5 {5\choose i}\left[0.2\right]^i\left[0.8\right]^{5-i}=0.05792$$

Therefore:

$$P(|X_{(3)}-\mu|<0.841\cdot\sigma) =P(|Z_{(3)}|<0.841) \approx 0.94208-0.05792 = 88.416\%$$
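The arithmetic above can be checked numerically. The following is only a sketch in Python using the standard library; the helper names `phi`, `order_stat_cdf` and `median_within` are made up for this illustration. It evaluates the binomial-sum formula once with $\Phi(0.841)$ rounded to $0.8$, as in the calculation above, and once with $\Phi(0.841)$ computed from the error function.

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def order_stat_cdf(p, k=3, n=5):
    """P(X_(k) <= x) for an iid sample of size n, where p = F(x)."""
    return sum(comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k, n + 1))

def median_within(c, k=3, n=5):
    """P(|Z_(k)| < c) for order statistics of a standard normal sample."""
    return order_stat_cdf(phi(c), k, n) - order_stat_cdf(phi(-c), k, n)

# Phi(0.841) rounded to 0.8, matching the hand calculation
rounded = order_stat_cdf(0.8) - order_stat_cdf(0.2)
# Phi(0.841) evaluated without rounding
exact = median_within(0.841)

print(f"rounded: {rounded:.5f}")
print(f"exact:   {exact:.6f}")
```

The rounded version reproduces $0.88416$; the unrounded value comes out slightly lower, because $\Phi(0.841)$ is a little below $0.8$.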

1
On

I know this isn't a proof or anywhere near rigorous, but according to Mathematica, the PDF of the median of an IID sample of size $5$ drawn from a normal distribution with mean $\mu$ and standard deviation $\sigma$ is $$f_{X_{(3)}}(x) = \frac{15}{8 \sqrt{2\pi} \sigma} e^{-(x-\mu)^2/(2\sigma^2)} \left(1 - \operatorname{erf}^2\left(\frac{x-\mu}{\sigma \sqrt{2}}\right)\right)^{\!2},$$ where $\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt$ is the error function, related to the standard normal CDF $\Phi$ by $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. Then the probability $$\Pr[\mu - k\sigma \sqrt{2} < X_{(3)} < \mu + k\sigma \sqrt{2}] = \frac{1}{8}\operatorname{erf}(k)\left(15 - 10 \operatorname{erf}(k)^2 + 3 \operatorname{erf}(k)^4\right),$$ where $k > 0$ is some constant. In your case, $k = 0.841/\sqrt{2} \approx 0.594677$, which gives a probability of about $0.883893$.
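For readers without Mathematica, the closed form can be evaluated in a few lines of Python. This is only a sketch: it assumes the special function appearing in the closed form is the error function $\operatorname{erf}$ (that reading is what reproduces the probability quoted above), and the function name `median_of_five_within` is invented for the example.

```python
from math import erf, sqrt

def median_of_five_within(c):
    """P(|X_(3) - mu| < c * sigma) for a normal sample of size 5.

    Uses the closed form (1/8) * e * (15 - 10 e^2 + 3 e^4) with
    e = erf(k) and k = c / sqrt(2), i.e. the answer's k.
    """
    e = erf(c / sqrt(2.0))
    return e * (15.0 - 10.0 * e**2 + 3.0 * e**4) / 8.0

k = 0.841 / sqrt(2.0)  # the constant k for the question's bound 0.841 * sigma
print(f"k = {k:.6f}")
print(f"probability = {median_of_five_within(0.841):.6f}")
```

As the bound grows, $\operatorname{erf}(k) \to 1$ and the expression tends to $(15 - 10 + 3)/8 = 1$, which is a quick consistency check on the coefficients.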

Relevant Mathematica code:

PDF[OrderDistribution[{NormalDistribution[m, s], 5}, 3], x]

Integrate[%, {x, m - k s Sqrt[2], m + k s Sqrt[2]},
          Assumptions -> {Element[m, Reals], s > 0, k > 0}]

F[m_, s_, k_] :=  m - k s Sqrt[2] < Median[RandomVariate[
                  NormalDistribution[m, s], 5]] < m + k s Sqrt[2]

Tally[Parallelize[Table[F[RandomReal[], 1, .594677], {10^6}]]]

The last two commands run a simulation of $10^6$ trials: each trial draws a sample of $5$ normally distributed random variables with standard deviation $1$ and mean $\mu$ drawn from a uniform $(0,1)$ distribution, then takes the median and tests whether it satisfies the inequality. When I ran it, I got $884026$ successes, i.e. a relative frequency of about $0.884026$, which is very close to the theoretical probability.