I've written the following code in R:
```r
n <- 10
delta <- numeric(10000)
for (i in 1:10000) {
  x <- sort(rt(n, df = 1))
  z <- pnorm(x, mean(x), sd(x))
  delta[i] <- sum(abs(z - (1:n)/n)) / sum(pmax(z, (1:n)/n))
}
```
I want to compute the statistical power of this proposed normality test via simulation, using 10000 iterations, for different sample sizes and various alternative distributions. For a fixed sample size $n$ and a pre-assumed alternative distribution (such as Student's t with 1 degree of freedom), the power of the test is the percentage of the 10000 Monte Carlo samples drawn from that distribution for which the test rejects normality. Could someone tell me how to compute the statistical power of this normality test in R?
Suppose a test of normality is performed at significance level 5%. This means that it rejects (has P-value below 0.05) with probability $0.05$ for a random sample from a normal population.
One hopes that the test rejects with larger probability for a random sample from a non-normal distribution (its power).
No normality test has good power against all non-normal distributions, but the Shapiro-Wilk test works reasonably well for a wide variety of them. This test is implemented in R as `shapiro.test`. Below I show how to use R to find the power of the S-W test for samples of size $n = 10$ from Student's t distribution with $\nu = 1$ degree of freedom (a standard Cauchy distribution).
For one such sample, the S-W test in R looks like this. Notice that S-W does not reject for this particular sample.
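A minimal sketch of this step (the seed here is arbitrary; whether S-W rejects depends on the particular sample drawn):

```r
set.seed(1234)        # arbitrary seed, for reproducibility
x <- rt(10, df = 1)   # one sample of size n = 10 from t(1), i.e. standard Cauchy
shapiro.test(x)       # prints the W statistic and the P-value
```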
One can use `$`-notation to get just the P-value. In your question, you propose using $10^4$ iterations, but I think using $10^5$ iterations is more likely to give useful results.
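For example (again with an arbitrary seed, so the value printed depends on it):

```r
set.seed(1234)             # arbitrary seed
x <- rt(10, df = 1)        # one Cauchy sample of size 10
shapiro.test(x)$p.value    # just the numeric P-value, nothing else printed
</imports>
```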
Here is a simulation in R to approximate the power of the S-W normality test against the alternative that the sample is from $\mathsf{T}(\nu = 1).$ The power is about $0.5912\pm 0.0031.$
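A sketch of such a simulation (the seed is arbitrary; estimates near the quoted value are typical, but the exact figure depends on the seed):

```r
set.seed(2023)   # arbitrary seed
n <- 10
pv <- replicate(1e5, shapiro.test(rt(n, df = 1))$p.value)  # 100,000 P-values
power.est <- mean(pv <= 0.05)   # proportion of rejections at the 5% level
power.est
2 * sqrt(power.est * (1 - power.est) / 1e5)  # approx. 95% margin of simulation error
```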
A run with samples of size $n = 20$ (not shown) gave about 87% power. Thus the S-W normality test is not so good at rejecting Cauchy samples of size 10, and considerably better at rejecting such samples of size 20.
Notes: (1) The numeric vector `pv` contains $100\,000$ P-values. The logical vector `pv <= 0.05` contains $100\,000$ `TRUE`s and `FALSE`s (`TRUE` upon rejection). Its `mean` is its proportion of `TRUE`s. The last line of code shows an approximate 95% margin of simulation error; we are getting almost 2-place accuracy from $10^5$ iterations. (The asymptotic Wald CI is used; it works well for $100\,000$ iterations.)
(2) This style of simulation using `$`-notation ensures that I am using exactly the version of the Shapiro-Wilk test that is implemented in R, and the code is easy to write. However, it is not the fastest possible code, because R formats the entire test output at each iteration while we use only the P-value.

(3) If you can write a function that returns the P-value of your test of normality, then you can use analogous R code (with `replicate`) to explore its power.
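For the statistic `delta` in your question, a sketch along those lines might look as follows. It assumes that large values of `delta` indicate departure from normality, and it estimates the 5%-level critical value itself by simulation under the null (normal samples); both of these are assumptions for illustration:

```r
# Assumed convention: reject normality for large values of delta.
delta.stat <- function(x) {
  n <- length(x)
  z <- pnorm(sort(x), mean(x), sd(x))
  sum(abs(z - (1:n)/n)) / sum(pmax(z, (1:n)/n))
}

set.seed(2023)                                     # arbitrary seed
n <- 10
d0   <- replicate(1e5, delta.stat(rnorm(n)))       # null distribution of delta
crit <- quantile(d0, 0.95)                         # estimated 5%-level critical value
d1   <- replicate(1e5, delta.stat(rt(n, df = 1)))  # delta under the Cauchy alternative
mean(d1 > crit)                                    # estimated power
```

The same two-step recipe (simulate the null to get a critical value, then simulate the alternative to get the rejection rate) works for any statistic whose null distribution is not known in closed form.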