normality of data


Does the Q-Q plot below suggest that the data is normally distributed?

[image: Q-Q plot]

The fact that it's nearly perfectly linear is to me an indication of normality. However, the Anderson-Darling test for some reason rejects the null hypothesis (normality). Why is that?

(The data is for a certain MLE that I know for a fact is asymptotically normal, but only asymptotically. So basically what I'm trying to do here is find out how quickly, i.e., from what minimum sample size, this asymptotic normality takes effect. The Q-Q plot and the AD test give two contradictory answers.)


There are 1 best solutions below


Here's a fact about hypothesis testing: for large sample sizes, almost any deviation from the null will be labeled "significant", even if it's slight. Asymptotic normality of a statistic does not ensure that the AD test will fail to reject, because at any finite sample size the statistic's distribution is never, in fact, exactly normal. Hence, with a large sample, all you are doing is amping up the power of the normality test while feeding it a (slightly) non-normal distribution.
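To make this concrete, here is a minimal sketch, not your actual setup. As a stand-in for your MLE it uses the mean of `M = 50` exponential draws (the MLE of the exponential mean, which is asymptotically normal but slightly right-skewed at any finite `M`). The AD statistic is computed by hand with the usual small-sample adjustment for an estimated mean and standard deviation, and compared against 0.752, the commonly tabulated 5% critical value for that case. The choice of estimator, `M`, the replicate counts, and the seed are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

CRIT_5PCT = 0.752  # tabulated 5% critical value (mean and sd both estimated)


def anderson_darling_normal(xs):
    """Adjusted Anderson-Darling statistic for normality, parameters estimated."""
    n = len(xs)
    mu, sigma = fmean(xs), stdev(xs)
    std_normal = NormalDist()
    z = sorted((x - mu) / sigma for x in xs)
    total = 0.0
    for i in range(n):
        # 1 - Phi(z) is computed as Phi(-z) for better accuracy in the tail
        log_lower = math.log(std_normal.cdf(z[i]))
        log_upper = math.log(std_normal.cdf(-z[n - 1 - i]))
        total += (2 * i + 1) * (log_lower + log_upper)
    a2 = -n - total / n
    return a2 * (1 + 0.75 / n + 2.25 / n ** 2)


def simulate_mles(n_replicates, m, rng):
    """Hypothetical stand-in for the MLE: mean of m Exponential(1) draws."""
    return [sum(rng.expovariate(1.0) for _ in range(m)) / m
            for _ in range(n_replicates)]


M = 50  # sample size behind each estimate (assumed)
rng = random.Random(0)

stat_small = anderson_darling_normal(simulate_mles(100, M, rng))
stat_large = anderson_darling_normal(simulate_mles(20000, M, rng))

print(f"100 replicates:   A*^2 = {stat_small:.3f} (5% critical value {CRIT_5PCT})")
print(f"20000 replicates: A*^2 = {stat_large:.3f} (5% critical value {CRIT_5PCT})")
```

The distribution being tested is the same in both runs; only the number of replicates handed to the test changes. With 20000 replicates the statistic should land well above the critical value, even though the departure from normality is small.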

From your plot, your data is very well approximated by a normal. As a bit of applied-stats advice: when fitting distributions, rely more on visual/qualitative criteria than on all-or-nothing tests. I know we all like to have clear answers to questions, but in the case of testing fit, it's often better to just look at the Q-Q plot and a histogram to validate your assumption. In the end, all statistical models are wrong (apart from contrived examples), but some are much less wrong than others. That's what you want to establish.
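If you want a number to go with the visual impression, one rough option is the correlation between the sorted sample and the matching normal quantiles, i.e. how close the Q-Q plot is to a straight line. The sketch below again uses a hypothetical stand-in (the mean of `M = 50` exponential draws, 2000 replicates, plotting positions `(i + 0.5) / n`); none of these choices come from your data.

```python
import math
import random
from statistics import NormalDist


def qq_correlation(xs):
    """Pearson correlation between sorted data and standard normal quantiles."""
    n = len(xs)
    std_normal = NormalDist()
    sample_q = sorted(xs)
    theo_q = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    mean_s = sum(sample_q) / n
    mean_t = sum(theo_q) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(sample_q, theo_q))
    var_s = sum((s - mean_s) ** 2 for s in sample_q)
    var_t = sum((t - mean_t) ** 2 for t in theo_q)
    return cov / math.sqrt(var_s * var_t)


M = 50  # sample size behind each estimate (assumed)
rng = random.Random(1)
estimates = [sum(rng.expovariate(1.0) for _ in range(M)) / M
             for _ in range(2000)]

r = qq_correlation(estimates)
print(f"Q-Q correlation: {r:.4f}")
```

A value very close to 1 says the normal approximation is good in the practical sense. Unlike a p-value, it does not drift toward "reject" just because you collected more replicates.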