How to test whether data diverge significantly from an exponential distribution using the chi-squared test


How can I test whether the following data diverge from the exponential distribution with $\tau = 2.197$? The data have the form:

$0 \leq x \leq 0.5 :$ 194

$ 0.5 \leq x \leq 1 :$ 117

$1 \leq x\leq 1.5 :$ 111

$1.5 \leq x \leq 2.5 :$ 165

$2.5 \leq x \leq 4 :$ 163

$4\leq x :$ 139

I first tried to calculate $$\chi^2 = \frac{(194 -\frac{889}{6})^2}{\frac{889}{6}} + \frac{(117 -\frac{889}{6})^2}{\frac{889}{6}} + \dots,$$ but this did not give the correct result of $T = \chi^2 = 9.786$. Can somebody please help me?

Best answer

I got reasonably close to the value of the $T$ statistic that you claim is correct. I will show my computations in R statistical software, and give some explanation of the computational procedure.

I will leave it to you to work this problem in whatever way is appropriate for your class. I have given enough intermediate steps that you can check your work along the way.

Here is the R code I used and a summary of the results:

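# Bin probabilities under Exp(rate = 1/2.197), i.e. mean 2.197
p = diff(pexp(c(0,.5,1,1.5, 2.5,4,Inf), 1/2.197))
x = c(194,117,111,165,163,139)   # observed counts
n = sum(x); E = n*p;  n;  E      # total count, then expected counts
[1] 889
[1] 180.9504 144.1191 114.7845 164.2336 140.9672 143.9452
T.stat = sum((x-E)^2/E); T.stat  # named T.stat so the built-in T (TRUE) is not masked
[1] 9.786033
crit = qchisq(.95, 5); crit      # named crit so the built-in c() is not masked
[1] 11.0705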

Int.L = c(0,.5,1,1.5, 2.5,4)  # Lower ends of intervals
Term = (x-E)^2/E              # Term in sum
cbind(Int.L, x, p, E, Term)
     Int.L   x         p        E        Term
[1,]   0.0 194 0.2035437 180.9504 0.941100115
[2,]   0.5 117 0.1621137 144.1191 5.103029241
[3,]   1.0 111 0.1291165 114.7845 0.124778935
[4,]   1.5 165 0.1847397 164.2336 0.003576649
[5,]   2.5 163 0.1585683 140.9672 3.443656664
[6,]   4.0 139 0.1619181 143.9452 0.169891271

Here is how to compute the probability for the first interval: The CDF of $\mathsf{Exp}(mean=2.197)$ is $F(x) = 1-e^{-x/2.197},$ so the probability for the first interval is $$F(.5) - F(0) = (1 - e^{-.5/2.197}) - (1 - e^0) = 0.2035437,$$ which matches $p_1$ in my table.
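The same hand computation can be sketched in base R as a check. The names `tau`, `cdf`, and `p1` are introduced here only for illustration; `pexp` is parameterized by the rate, which is the reciprocal of the mean.

```r
tau <- 2.197                            # mean of the exponential distribution
cdf <- function(x) 1 - exp(-x / tau)    # CDF of Exp(mean = tau), valid for x >= 0

p1 <- cdf(0.5) - cdf(0)                 # probability of the first interval [0, 0.5]
p1

# Cross-check against the built-in exponential CDF (rate = 1 / mean)
all.equal(p1, pexp(0.5, rate = 1 / tau))
```

The remaining interval probabilities follow the same pattern, with the last one equal to $1 - F(4)$.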

The expected count $E_i$ for each interval is $E_i = np_i.$ Thus $E_1 = 889p_1 = 180.9504.$ This also agrees with the expected count in the first row of my table. (Expected counts should not be rounded to integers.)
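This is probably where the attempt in the question went astray: using $889/6$ as every expected count tests the data against equal bin probabilities rather than against the exponential model. A minimal sketch of the contrast follows, with `breaks`, `E.equal`, and `E.exp` as illustrative names.

```r
x <- c(194, 117, 111, 165, 163, 139)          # observed counts
n <- sum(x)                                   # total count
breaks <- c(0, 0.5, 1, 1.5, 2.5, 4, Inf)      # interval endpoints
p <- diff(pexp(breaks, rate = 1 / 2.197))     # exponential bin probabilities

E.equal <- rep(n / length(x), length(x))      # expected counts used in the question
E.exp   <- n * p                              # expected counts under Exp(mean = 2.197)

stat.equal <- sum((x - E.equal)^2 / E.equal)  # statistic against equal probabilities
stat.exp   <- sum((x - E.exp)^2 / E.exp)      # statistic against the exponential model
c(equal = stat.equal, exponential = stat.exp)
```

Only the second statistic corresponds to the hypothesis being tested here.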

Then the chi-squared test statistic $T$ is computed as:

$$T = \sum_{i=1}^6 \frac{(x_i - E_i)^2}{E_i}.$$

The first term in the sum is $(x_1 - E_1)^2/E_1 = 0.94110.$

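In this problem, under the null hypothesis, $T$ is approximately distributed as $\mathsf{Chisq}(df = 6-1 = 5),$ provided the mean 2.197 was specified in advance rather than estimated from these data (estimating it would cost one more degree of freedom).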
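As a cross-check, base R's `chisq.test` accepts the observed counts together with a vector of null probabilities, and it should reproduce both the statistic and the degrees of freedom. This is only a sketch; `fit` is an illustrative name.

```r
x <- c(194, 117, 111, 165, 163, 139)                               # observed counts
p <- diff(pexp(c(0, 0.5, 1, 1.5, 2.5, 4, Inf), rate = 1 / 2.197))  # null probabilities

fit <- chisq.test(x, p = p)   # goodness-of-fit test against the given probabilities
fit$statistic                 # X-squared: should agree with the hand-computed sum
fit$parameter                 # degrees of freedom: number of bins minus 1
```

Note that `chisq.test` always uses the number of bins minus 1 as the degrees of freedom, so it does not account for any parameter estimated from the data.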

The 95th percentile of this distribution is the critical value $c = 11.07$ for a test at the 5% level of significance.

If $T > 11.07$ then you would reject the null hypothesis that the data fit the distribution $\mathsf{Exp}(mean=2.197).$ However, for this problem $T = 9.786033$ (my version) which is smaller than the critical value, so we do not reject the null hypothesis.
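The same decision can be expressed through a p-value instead of a critical value. The sketch below assumes the statistic and degrees of freedom computed above; `T.stat`, `crit`, and `p.value` are illustrative names.

```r
T.stat <- 9.786033                                      # test statistic from the table above
crit <- qchisq(0.95, df = 5)                            # critical value at the 5% level
p.value <- pchisq(T.stat, df = 5, lower.tail = FALSE)   # upper-tail probability of T.stat

T.stat > crit      # FALSE: the statistic is below the critical value
p.value            # larger than 0.05, so the null hypothesis is not rejected
```

Both comparisons are equivalent: the p-value exceeds 0.05 exactly when the statistic is below the 95th percentile.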

This test does not allow you to claim for sure that the data are from $\mathsf{Exp}(mean=2.197).$ However, your sample size $n = 889$ is fairly large, so it is reasonable to say that your data 'do not diverge significantly' from that population at the 5% level.