I have two samples, $X_1 \; (N = 97)$ and $X_2 \; (N = 4782)$, drawn from the same population. I would like to test whether each sample is normally distributed, using statistical visualizations (such as normplot and qqplot) and hypothesis tests (such as jbtest, chi2gof, and kstest) in MATLAB.
My first sample ($X_1$) is
X = [8.13010235400000,13.6713071300000,14.0362434700000,18.4349488200000,26.5650511800000,30.9637565300000,34.3803447200000,40.6012946500000,45,49.3987053500000,58.6713071300000,59.0362434700000,59.0362434700000,59.0362434700000,61.9275130600000,61.9275130600000,63.4349488200000,63.4349488200000,63.4349488200000,63.4349488200000,63.4349488200000,64.4400348300000,71.5650511800000,71.5650511800000,71.5650511800000,71.5650511800000,75.9637565300000,75.9637565300000,75.9637565300000,75.9637565300000,75.9637565300000,75.9637565300000,75.9637565300000,75.9637565300000,75.9637565300000,75.9637565300000,77.4711922900000,77.4711922900000,77.4711922900000,77.4711922900000,77.4711922900000,77.4711922900000,77.4711922900000,77.4711922900000,77.4711922900000,78.6900675300000,90,90,90,90,90,90,90,90,90,90,90,90,90,90,90,93.1798301200000,97.1250163500000,97.7651660200000,102.528807700000,102.528807700000,102.528807700000,102.528807700000,102.528807700000,104.036243500000,104.036243500000,104.036243500000,104.036243500000,104.036243500000,104.036243500000,104.036243500000,105.255118700000,108.434948800000,108.434948800000,108.434948800000,108.434948800000,109.440034800000,116.565051200000,118.072486900000,120.963756500000,127.746805400000,130.601294600000,135,137.489552900000,139.398705400000,139.398705400000,149.036243500000,153.434948800000,159.227745300000,161.565051200000,179.999998800000,180];
The visualizations in MATLAB suggest that the underlying distribution of both samples is normal. However, at the same significance level, the hypothesis tests do not reject the null hypothesis for $X_1$ (except for the chi-square test), but they reject it for $X_2$ in every test.
I am now confused about how to show that my samples are normally distributed and that they come from the same population. What can I do in this situation?
PS: Sample $X_2$ is too large to post here, but if anyone can suggest a way to share it, I am happy to do so.


You should use a Shapiro-Wilk Test. Let
$$H_0 : \text{ the data is normally distributed}$$
$$H_a : \text{ the data is not normally distributed}$$
R statistical software gives a p-value of $0.01061$.
Since $0.01061 < 0.05$, we have significant evidence at $\alpha = 0.05$ to reject the null hypothesis and conclude that the data are not normally distributed.
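The same "reject when $p < \alpha$" rule applies to the other tests mentioned in the question. As a rough illustration, here is a minimal Python sketch of the Jarque-Bera statistic using only the standard library. It is not the Shapiro-Wilk computation above, and it uses the asymptotic chi-square approximation with 2 degrees of freedom for the p-value, so for small samples it may disagree with what MATLAB's jbtest reports. The function names and the toy data are my own assumptions, not part of the original analysis.

```python
import math

def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a sequence of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments with the 1/n convention used in the JB statistic.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # For a chi-square variable with 2 degrees of freedom, P(X > x) = exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

def reject_normality(sample, alpha=0.05):
    """True when the asymptotic JB p-value falls below alpha."""
    _, p_value = jarque_bera(sample)
    return p_value < alpha

# Hypothetical toy data, not the samples from the question.
symmetric = [-2.0, -1.0, -1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0]
skewed = [1.0] * 40 + [2.0] * 5 + [50.0]

print(jarque_bera(symmetric))
print(reject_normality(symmetric), reject_normality(skewed))
```

Keep in mind that failing to reject $H_0$ is not proof of normality; it only means the test did not find enough evidence against it.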
A Q-Q plot supports this conclusion.
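If you want to see what a normal Q-Q plot is built from, the points can be computed by hand: sort the sample and pair each value with the corresponding standard normal quantile. Below is a minimal Python sketch using only the standard library. The plotting positions $(i - 0.5)/n$ are one common convention and an assumption on my part (R and MATLAB may use slightly different ones), and the helper names and toy data are hypothetical.

```python
from statistics import NormalDist, mean

def normal_qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    observed = sorted(sample)
    n = len(observed)
    # Plotting positions (i - 0.5) / n; other conventions exist.
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf(p) for p in probs]
    return list(zip(theoretical, observed))

def qq_correlation(sample):
    """Pearson correlation of the Q-Q points; near 1 means nearly a straight line."""
    pts = normal_qq_points(sample)
    xs = [t for t, _ in pts]
    ys = [o for _, o in pts]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical toy data: exact normal quantiles (shifted and scaled) versus
# a heavily skewed sample. Neither is the data from the question.
ideal = [10 + 2 * NormalDist().inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
skewed = [1.0] * 40 + [2.0] * 5 + [50.0]

print(qq_correlation(ideal))
print(qq_correlation(skewed))
```

Points that fall close to a straight line (correlation close to 1) are consistent with normality, while systematic curvature or long runs of tied values, as with the repeated angles in the posted sample, show up as departures from the line.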