Testing whether $X_1, \dots, X_n$ are sampled from standard normal distribution


What are the possible ways to test whether i.i.d. random variables come from the standard normal distribution:

$$X_1, \dots, X_n \sim N(0,1)$$

What do we need to measure, and what is the MVUE statistic for testing $H_0$ that they do against $H_1$ that they do not?

What are the size and power of this test?


This is more of a rebuttal than an answer. I hope to convince you that you need to state the purpose of your project, take sample size into account, and say which alternatives to the standard normal distribution matter to you. Whatever test(s) you consider, you will need additional context.

One major difficulty with tests of normality is that they may have very poor power (ability to detect departures from normality) for small samples, and that they may reject for a large sample that is very nearly normal, just not in ways that matter for practical purposes.

Another is that the alternative can make a large difference in the behavior of a test of normality, and you do not state one. Is your alternative that the data are normal, but randomly sampled from a population with mean other than $\mu = 0$ or standard deviation other than $\sigma = 1?$ Or is the alternative that the population is not normal at all?

Consider several scenarios using the Kolmogorov-Smirnov test of $H_0$ that data are sampled from $\mathsf{Norm}(0,1).$

Small samples: $n = 10.$ In none of the cases below is the null hypothesis that the data are standard normal rejected. This is the appropriate result only for the first test, in which the data really are standard normal. Following @GolderRatio's suggestion, a t test is included in the one instance in which $\mu \ne 0.$ [R code. For brevity, `$p.val` notation is used to show only P-values for most of the tests.]

set.seed(1022)
n = 10
x = rnorm(n)             # standard normal data
ks.test(x, pnorm, 0, 1)  # appropriately fails to reject

        One-sample Kolmogorov-Smirnov test

data:  x
D = 0.11601, p-value = 0.9966
alternative hypothesis: two-sided

y = rnorm(n, .25, 1)      # Normal, but mean 0.25
ks.test(y, pnorm, 0, 1)$p.val
[1] 0.1230996             # Falsely fails to reject
t.test(y)$p.val                     
[1] 0.2673517             # t test fails to rej wrong mean 
t = rt(n, 50)             # pop not exactly normal
ks.test(t, pnorm, 0, 1)$p.val  # slight non-normality not detected
u = runif(n, -sqrt(3), sqrt(3))  # uniform population     
ks.test(u, pnorm, 0, 1)$p.val   
[1] 0.1454823             # fail to detect non-normality 
w = rexp(n) - 1           # shifted exponential data
ks.test(w, pnorm, 0, 1)$p.val
[1] 0.4284639             # fail to detect non-normality
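The low power at $n = 10$ can be quantified by simulation. The sketch below is my own illustration (the 5% level and the number of replications are choices of mine, not from the answer above); it estimates the rejection rate of the KS test for genuinely standard normal data (the size) and for the shifted exponential alternative (the power).

```r
# Monte Carlo sketch of KS size and power at n = 10
# (level 0.05 and B = 5000 replications are assumptions for illustration)
set.seed(2024)
B = 5000; n = 10
pv.nrm = replicate(B, ks.test(rnorm(n), pnorm, 0, 1)$p.value)
pv.exp = replicate(B, ks.test(rexp(n) - 1, pnorm, 0, 1)$p.value)
mean(pv.nrm <= 0.05)   # estimated size: near the nominal 0.05
mean(pv.exp <= 0.05)   # estimated power vs shifted exponential: well below 1
```

Even against an alternative as skewed as the exponential, a sample of ten gives the KS test only modest power.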

Large samples: $n = 1000.$ For large samples the clearly non-normal cases (normal with mean $0.25,$ uniform, and exponential) are strongly rejected, with P-values very nearly $0.$ The 'essentially' normal t data with DF = 50 may still escape detection even at this sample size.

set.seed(1022)
n = 1000
x = rnorm(n)
ks.test(x, pnorm, 0, 1)

        One-sample Kolmogorov-Smirnov test

data:  x
D = 0.029514, p-value = 0.3484
alternative hypothesis: two-sided

y = rnorm(n, .25, 1)
ks.test(y, pnorm, 0, 1)$p.val
[1] 5.050318e-08
t.test(y)$p.val
[1] 9.776068e-11
t = rt(n, 50)
ks.test(t, pnorm, 0, 1)$p.val
u = runif(n, -sqrt(3), sqrt(3))
ks.test(u, pnorm, 0, 1)$p.val
[1] 0.000140186
w = rexp(n) - 1
ks.test(w, pnorm, 0, 1)$p.val
[1] 0
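To see how much power the KS test actually has against the t distribution with 50 degrees of freedom at $n = 1000,$ one can repeat the test many times. This is a sketch of my own; the 5% level and the replication count are assumptions, not part of the answer above.

```r
# Estimated rejection rate of KS vs N(0,1) when data are really t with 50 DF
# (B = 1000 replications and the 0.05 level are illustrative assumptions)
set.seed(2024)
B = 1000; n = 1000
pv.t = replicate(B, ks.test(rt(n, 50), pnorm, 0, 1)$p.value)
mean(pv.t <= 0.05)    # estimated rejection rate at the 5% level
```

Even a thousand observations give little power against so mild a departure from normality; the rejection rate stays close to the nominal level.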