Basic question about significance of statistical tests


Apologies for the basic question; this is really not my area at all, but I’m trying to help a friend out.

Whilst reading the Wikipedia page for the Shapiro-Wilk test I came across the following: “As with most statistical tests, the test may be statistically significant from a normal distribution in any large samples. Thus a Q–Q plot is useful for verification in addition to the test”

I interpret this to mean that if we sampled a large amount of data from what was in fact a Normal population, the test may in fact reject the null hypothesis that the parent population was Normal. Is this interpretation correct?

If so, why is this the case? I thought in general larger samples gave better testing?! Any intuition on this would be very much appreciated.

Best answer

Let $SW$ be the Shapiro-Wilk statistic, let $F$ be its CDF under $H_0$ (assumed continuous and strictly increasing), and let $P = F(SW)$ be its p-value (the test rejects for small values of $SW$). Then, for any $p \in [0,1]$, $$ \mathbb{P}(P \le p) = \mathbb{P}(F(SW) \le p) = \mathbb{P}( SW \le F^{-1}(p)) = F(F^{-1}(p)) = p, $$ hence $P = F(SW) \sim U[0,1]$. That is, under $H_0$ the p-value is uniformly distributed on $[0,1]$, so if you reject the null hypothesis whenever the p-value is below $0.05$, you have probability $0.05$ of falsely rejecting $H_0$. This is true regardless of the sample size.

The following simulation illustrates the distribution of $10^4$ Shapiro-Wilk p-values, each computed from a sample of size $1000$ drawn from $\mathcal{N}(0,1)$.

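# Each column of mat is one sample of size 10^3 from N(0, 1); 10^4 samples in total
mat = matrix( rep(NA, 10^4 * 10^3), nrow = 10^3 )
for(i in 1 : ncol(mat)){
  mat[ ,i] = rnorm(10^3, 0, 1)
}

# Shapiro-Wilk p-value for each sample (vector preallocated)
shap.vec = numeric( ncol(mat) )

for(i in 1 : ncol(mat)) {
  shap.vec[i] = shapiro.test( mat[, i] )$p.value
}

# Histogram of the p-values; it should look roughly flat on [0, 1]
hist( shap.vec,
      breaks = 100,
      col    = "blue",
      prob   = TRUE,
      main   = "p.values of SW test under H0",
      xlab   = "p.values" )

# Proportion of samples rejected at the 0.05 level
mean( shap.vec < 0.05 )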

[1] 0.0517
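
To check the claim that the false rejection rate does not depend on the sample size, the same experiment can be repeated for several values of $n$. This is only a minimal sketch: the sample sizes, the number of replications, the seed and the names `n.sizes`, `n.rep` and `rej.rate` are arbitrary choices made for illustration and are not part of the simulation above.

```r
# Sketch: rejection rate of shapiro.test() under H0 for several sample sizes.
# All constants below are illustrative assumptions.
set.seed(1)                     # arbitrary seed, only for reproducibility
n.sizes = c(20, 100, 1000)      # sample sizes to compare
n.rep   = 2000                  # simulated samples per sample size

rej.rate = sapply(n.sizes, function(n) {
  # p-values of the test on n.rep independent N(0, 1) samples of size n
  p.vals = replicate(n.rep, shapiro.test(rnorm(n, 0, 1))$p.value)
  # proportion of samples rejected at the 0.05 level
  mean(p.vals < 0.05)
})

names(rej.rate) = paste0("n=", n.sizes)
print(rej.rate)
```

If the p-value is indeed uniform under $H_0$, each entry of `rej.rate` should be close to $0.05$, up to simulation noise, whatever the value of $n$.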
