Confidence coefficient

Suppose that $W$ is a random variable of the continuous type. A random sample of size $n = 8$ of $W$ is taken. The sample was: $10.86, 8.33, 7.8, 13.21, 9.04, 7.63, 11.02, 11.32$. (i) Determine the confidence coefficient of the interval $(y_4, y_8)$ for the third quartile. (ii) Consider the statement "There is a $100(1 - \alpha)$% chance that the third quartile lies in $(y_4, y_8)$". Is this statement correct? Justify your answer.

My attempt:

(i)

In order, the data is:

$7.63, 7.8, 8.33, 9.04, 10.86, 11.02, 11.32, 13.21$

The interval $(y_4, y_8)$ is $(9.04, 13.21)$. The coefficient, $c$, is therefore given by:

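$c = P(Y_4 < \pi_{0.75} < Y_8)$, where $\pi_{0.75}$ denotes the third quartile.

Let $N$ be the number of sample observations that fall below $\pi_{0.75}$. Then $N$ is binomial with parameters $n = 8$ and $p = 0.75$, and the event $Y_4 < \pi_{0.75} < Y_8$ occurs exactly when $4 \le N \le 7$, hence:

$c = P(N = 4) + P(N = 5) + P(N = 6) + P(N = 7) = \sum_{k=4}^{7} \binom{8}{k} (0.75)^k (0.25)^{8-k} \approx 0.87$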
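
As a quick check of the arithmetic, the binomial sum above can be evaluated in R. This is only a sketch that takes the Binomial(8, 0.75) model as given; the variable names are my own.

```r
# Number of observations below the third quartile: N ~ Binomial(8, 0.75)
n <- 8
p <- 0.75

# The interval (Y_4, Y_8) covers the third quartile exactly when 4 <= N <= 7
conf_coef <- sum(dbinom(4:7, size = n, prob = p))
round(conf_coef, 4)   # 0.8726
```

The exact value is $57186/65536$, which rounds to the $0.87$ quoted above.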

(ii)

With the information we have, the statement is:

"There is a $100(1 - 0.87)$% = 13% chance that the third quartile lies in $(9.04, 13.21)$."

I disagree. We cannot say that this interval contains the third quartile with some probability, as this implies that the quartile is a random variable.
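
One way to illustrate the distinction is a small simulation of the procedure over repeated samples. The sketch below assumes, purely for illustration, that $W$ is standard normal (the problem does not specify a distribution), so the true third quartile is the fixed number `qnorm(0.75)`. The seed and the number of replications are arbitrary choices.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 8
q3_true <- qnorm(0.75)            # fixed third quartile under the assumed N(0, 1) model

# For each simulated sample, check whether (y_4, y_8) covers the fixed quartile
covers <- replicate(20000, {
  y <- sort(rnorm(n))             # order statistics y_1 <= ... <= y_8
  y[4] < q3_true && q3_true < y[8]
})

coverage <- mean(covers)          # long-run proportion of intervals that cover
coverage                          # expected to be close to sum(dbinom(4:7, 8, 0.75))
```

Under these assumptions the randomness sits in the endpoints $Y_4$ and $Y_8$, not in the quartile: the coverage proportion describes the procedure across many samples, while a single computed interval such as $(9.04, 13.21)$ either contains the quartile or does not.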

Is this correct? Any assistance is much appreciated.

There are 1 best solutions below

Comment:

I admit that I do not understand the rationale for this type of confidence interval for the upper quartile. In particular, the method for finding the coverage probability to be "13%" does not seem correct. Can you give the source of this method and do you know the rationale for it?

Even though there are not many observations, I might find a quantile-method 95% bootstrap CI $(9.0, 13.2),$ as shown below. The CI itself is not much different from yours, but I think I understand the rationale for my interval.

x = c(7.63, 7.8, 8.33, 9.04, 10.86, 11.02, 11.32, 13.21)
n = length(x)
summary(x)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 7.630   8.197   9.950   9.901  11.095  13.210 

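set.seed(2021)
# Resample x with replacement and record the third quartile of each resample
q3.re = replicate(3000, quantile(sample(x, n, replace = TRUE), .75))
# Quantile-method 95% interval from the bootstrap distribution
quantile(q3.re, c(.025,.975))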
   2.5%   97.5% 
 8.9625 13.2100 

Also, the data are not far from normal, according to a Shapiro-Wilk test and a (nearly linear) normal probability plot.

shapiro.test(x)

        Shapiro-Wilk normality test

data:  x
W = 0.91843, p-value = 0.4172

qqnorm(x);  qqline(x, col="blue")

So a 95% quantile-method parametric bootstrap CI $(9.4, 12.8)$, based on a normal model fitted to the data, may be useful:

set.seed(2021)
q3.re = replicate(5000, quantile(rnorm(n,mean(x),sd(x)),.75))
quantile(q3.re, c(.025,.975))
     2.5%     97.5% 
9.363067 12.828225