Confidence Interval for Variability


The hydrocarbon emissions are known to have decreased dramatically during the 1980s.

A study was conducted to compare the hydrocarbon emissions at idling speed, in parts per million (ppm), for automobiles of model years 2006 and 2016. Samples of each model year were randomly selected and their hydrocarbon emission levels were recorded. The data are as follows:

Data Given:

  • 2006: 295, 545, 236, 388, 290, 352, 391, 291, 206

  • 2016: 281, 279, 212, 157, 241, 121, 275, 134

Assume that the hydrocarbon emission levels are normally distributed.

Construct a 90% confidence interval for the variability in hydrocarbon emissions of 2006 and 2016 cars.

I am confused about which formula I should use to solve this problem. Should I use the formula for the confidence interval of the difference between two sample means?

$(\bar x_1-\bar x_2) \pm t_{\alpha/2,\,n_1+n_2-2}\, S_p \sqrt{\frac 1{n_1}+ \frac 1{n_2}}$


There are 1 best solutions below


The formula you show uses the pooled sample standard deviation $S_p$ to make a confidence interval for the difference of population means, $\mu_1 - \mu_2.$ It is not relevant for finding CIs for variances.

This answer assumes your emission data for each year are normally distributed. I show 95% confidence intervals. Because CIs for variances tend to be long, some applied statisticians use 90% confidence intervals, which are somewhat shorter. I hope you can adapt the discussions below for 90% CIs.

Individual variances. To get a CI for the variance in '06:

x.06 = c(295,545,236,388,290,352,391,291,206)
v = var(x.06);  v
[1] 10286

Use $\frac{(n-1)S^2}{\sigma^2} \sim \mathsf{Chisq}(n-1).$ Thus $$\begin{aligned} .95 &= P\left(L \le \frac{(n-1)S^2}{\sigma^2} \le U\right)\\ &= P\left(\frac{(n-1)S^2}{U} \le \sigma^2 \le \frac{(n-1)S^2}{L} \right), \end{aligned}$$ where $L$ and $U$ cut 2.5% of probability from the lower and upper tails, respectively, of the $\mathsf{Chisq}(n-1)$ distribution. Thus a 95% CI for $\sigma^2$ is of the form $$\left(\frac{(n-1)S^2}{U},\, \frac{(n-1)S^2}{L} \right),$$ which can be computed in R as follows:

 8*v/qchisq(c(.975,.025), 8)
 [1]  4692.907 37751.452

Notice that the sample variance $v = S^2 = 10286$ lies within the CI $(4692.907,\, 37751.452),$ but not at its center. If you want a 95% CI for the population standard deviation, take square roots of endpoints.
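
Since the question asks for 90% confidence, here is a minimal sketch of how the same chi-square computation might be adapted: the quantiles cut 5% (instead of 2.5%) from each tail. The names `n`, `alpha`, and `ci.var.90` are mine, introduced only for illustration.

```r
# Hydrocarbon emissions (ppm) for the 2006 sample, as given in the question
x.06 <- c(295,545,236,388,290,352,391,291,206)

n     <- length(x.06)   # sample size (9)
v     <- var(x.06)      # sample variance S^2
alpha <- 0.10           # 1 - confidence level, for a 90% CI

# The upper chi-square quantile gives the lower endpoint, and vice versa
ci.var.90 <- (n - 1) * v / qchisq(c(1 - alpha/2, alpha/2), df = n - 1)
ci.var.90          # 90% CI for the population variance

# Square roots of the endpoints give a 90% CI for the standard deviation
sqrt(ci.var.90)
```

The resulting interval should be contained in the 95% interval above, because a lower confidence level gives a shorter interval.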

You can get a 95% CI for the population variance in 2016 similarly. Also, Minitab statistical software has a relevant 'single-variance' procedure. Here is output (slightly edited for brevity) from this procedure for finding the confidence interval for the population variance in 2016:

CI for One Variance 

Method

The chi-square method is only for the normal 
distribution.

Statistics

N  StDev  Variance
8   67.0      4493

95% Confidence Intervals

                CI for         CI for
Method          StDev         Variance
Chi-Square  (44.3, 136.4)  (1964, 18610)
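
For readers without Minitab, the same chi-square interval for 2016 can be sketched in R. The helper `var.ci` below is hypothetical (it is not a base R function); it just wraps the formula derived above, and it assumes normal data.

```r
# Hypothetical helper (not part of base R): chi-square CI for the
# variance of a normal population, at a given confidence level
var.ci <- function(x, conf.level = 0.95) {
  n     <- length(x)
  alpha <- 1 - conf.level
  (n - 1) * var(x) / qchisq(c(1 - alpha/2, alpha/2), df = n - 1)
}

# Hydrocarbon emissions (ppm) for the 2016 sample, as given in the question
x.16 <- c(281,279,212,157,241,121,275,134)

ci.16 <- var.ci(x.16)        # 95% CI for the variance
round(ci.16)                 # compare with Minitab's 'CI for Variance'
round(sqrt(ci.16), 1)        # compare with Minitab's 'CI for StDev'

var.ci(x.16, conf.level = 0.90)   # 90% version, as the question requests
```

Up to rounding, the 95% results should agree with the Minitab output shown above.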

If you want a 95% CI for the ratio $\sigma_{.06}^2/\sigma_{.16}^2,$ then you can use the F-distribution.
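
As a sketch of the reasoning, parallel to the chi-square argument above: write $S_{.06}^2$ and $S_{.16}^2$ for the two sample variances and $n_1 = 9,$ $n_2 = 8$ for the sample sizes (this notation is mine). For independent normal samples, $$\frac{S_{.06}^2/\sigma_{.06}^2}{S_{.16}^2/\sigma_{.16}^2} \sim \mathsf{F}(n_1-1,\, n_2-1).$$ If $L$ and $U$ cut 2.5% of probability from the lower and upper tails, respectively, of this F-distribution, then $$\begin{aligned} .95 &= P\left(L \le \frac{S_{.06}^2/\sigma_{.06}^2}{S_{.16}^2/\sigma_{.16}^2} \le U\right)\\ &= P\left(\frac{S_{.06}^2/S_{.16}^2}{U} \le \frac{\sigma_{.06}^2}{\sigma_{.16}^2} \le \frac{S_{.06}^2/S_{.16}^2}{L}\right), \end{aligned}$$ so a 95% CI for the ratio is of the form $$\left(\frac{S_{.06}^2/S_{.16}^2}{U},\; \frac{S_{.06}^2/S_{.16}^2}{L}\right).$$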

This computation is part of the R procedure var.test, which gives the 95% CI $(0.467,\, 10.368)$:

x.06 = c(295,545,236,388,290,352,391,291,206)
x.16 = c(281,279,212,157,241,121,275,134)

var.test(x.06,x.16)

        F test to compare two variances

    data:  x.06 and x.16
    F = 2.2896, num df = 8, denom df = 7, p-value = 0.2917
    alternative hypothesis: true ratio of variances is not equal to 1
    95 percent confidence interval:
      0.4673195 10.3684028
    sample estimates:
    ratio of variances 
              2.289557 
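
To see where the interval in this output comes from, here is a minimal sketch that computes it directly from F quantiles, and then requests a 90% interval through the `conf.level` argument of `var.test`. The names `ratio`, `df1`, `df2`, `ci.ratio.95`, and `ci.ratio.90` are mine.

```r
# Data as given in the question
x.06 <- c(295,545,236,388,290,352,391,291,206)
x.16 <- c(281,279,212,157,241,121,275,134)

ratio <- var(x.06) / var(x.16)   # sample ratio of variances
df1   <- length(x.06) - 1        # numerator degrees of freedom (8)
df2   <- length(x.16) - 1        # denominator degrees of freedom (7)

# Upper F quantile gives the lower endpoint, and vice versa
ci.ratio.95 <- ratio / qf(c(.975, .025), df1, df2)
ci.ratio.95    # should match the interval printed by var.test above

# 90% CI for the ratio, as the question requests
ci.ratio.90 <- var.test(x.06, x.16, conf.level = 0.90)$conf.int
ci.ratio.90
```

If the interval for the ratio includes 1, the data are consistent with equal population variances at that confidence level.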

Note: Formulas for both kinds of CIs are given in most applied elementary statistics books and online.