I have two sets of 1000 $L_2$-normalized vectors (each 512 elements long). I compute the similarity of each vector in one set with each vector in the other set using the inner product, which yields a total of 1,000,000 similarity computations. When I construct a histogram of the resulting similarities, I get the following image:
which looks very much like a normal distribution to me. When I manually construct a normal distribution with the $\mu$ and $\sigma$ of the empirical data, I get the following plot:
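import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
# Empirical mean, variance and standard deviation of the similarities
mu = similarities.mean()
variance = similarities.var()
sigma = similarities.std()
# Overlay the fitted normal density (mu ± 5 sigma) on the density histogram
x = np.linspace(mu - 5*sigma, mu + 5*sigma, 100)
sns.histplot(without_diag, stat='density');
plt.plot(x, stats.norm.pdf(x, mu, sigma), 'r', alpha=0.6);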
This seems to fit just fine. I used numpy to perform the computation and seaborn to plot the histogram. I then used scipy.stats.normaltest to test whether the samples come from a normal distribution. Result:
>>> k2, p = stats.normaltest(without_diag)
>>> print(p)
3.2035057152116813e-221
I read somewhere that $p$ should be below $0.05$ in order to reject the null hypothesis that the data were sampled from a normal distribution, and $3\times 10^{-221}$ is very far below that threshold. Does this have to do with the number of samples? Is there a name for this kind of distribution?
EDIT:
Here is the normal probability plot suggested in the answer by @heropup:



Rather than plotting a histogram, it may be more informative to create a normal probability plot. This may show deviations from normality that might be hidden in a histogram, e.g., tails that are too heavy or too light.
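As a rough illustration of such a plot, here is a minimal sketch using `scipy.stats.probplot`. The helper names `probability_plot_data` and `draw_probability_plot` are hypothetical, and the sketch assumes the similarities are available as a one-dimensional NumPy array (such as the `without_diag` array in the question).

```python
import numpy as np
from scipy import stats


def probability_plot_data(samples):
    """Return theoretical normal quantiles, ordered samples, and the fitted line.

    Hypothetical helper: wraps scipy.stats.probplot without drawing anything.
    """
    samples = np.asarray(samples, dtype=float).ravel()
    (theoretical_q, ordered), (slope, intercept, r) = stats.probplot(samples, dist="norm")
    return theoretical_q, ordered, slope, intercept, r


def draw_probability_plot(samples, ax=None):
    """Draw the normal probability plot; points near the line suggest normality."""
    import matplotlib.pyplot as plt  # imported here so the data helper works without it

    theoretical_q, ordered, slope, intercept, r = probability_plot_data(samples)
    if ax is None:
        _, ax = plt.subplots()
    ax.plot(theoretical_q, ordered, ".", markersize=2, label="ordered samples")
    ax.plot(theoretical_q, slope * theoretical_q + intercept, "r", label="least-squares line")
    ax.set_xlabel("theoretical normal quantiles")
    ax.set_ylabel("ordered sample values")
    ax.legend()
    return ax


# Possible usage with the array from the question (an assumption, not run here):
# draw_probability_plot(without_diag)
# plt.show()
```

Systematic curvature away from the line at either end is the kind of tail deviation a histogram tends to hide.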
A caveat regarding statistical tests of non-normality is that, like many statistical tests, the power to reject becomes greater as the sample size increases. With $n = 10^6$ this means that even a small deviation from normality may be detected by such a test.
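The effect of sample size can be sketched with simulated data. The example below is only an illustration under stated assumptions: it draws from a Student $t$ distribution with 50 degrees of freedom (chosen here as a stand-in for "nearly normal" data, not taken from the question) and runs `scipy.stats.normaltest` at two sample sizes. The function name `normaltest_pvalues` is hypothetical.

```python
import numpy as np
from scipy import stats


def normaltest_pvalues(sample_sizes, df=50, seed=0):
    """Run normaltest on slightly heavy-tailed samples of increasing size.

    Assumption: a Student t distribution with `df` degrees of freedom stands in
    for data that deviates only mildly from normality.
    """
    rng = np.random.default_rng(seed)
    pvalues = {}
    for n in sample_sizes:
        samples = rng.standard_t(df, size=n)
        _, p = stats.normaltest(samples)
        pvalues[n] = float(p)
    return pvalues


pvalues = normaltest_pvalues([1_000, 1_000_000])
for n, p in pvalues.items():
    print(f"n = {n:>9,d}  p = {p:.3g}")
```

The underlying distribution is the same in both runs; only the amount of data changes, and with it the ability of the test to detect the same small deviation.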
That said, the resulting $p$-value you obtained does suggest deviation from normality; however, whether the extent of such deviation is meaningful (i.e., the "effect size" as it pertains to measures of goodness of fit) is a separate question.
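One possible way to look at the size of the deviation, rather than only its statistical significance, is to report descriptive measures directly. The sketch below is one choice among many and is not something the question or the test output prescribes: it computes sample skewness, excess kurtosis, and the largest gap between the empirical CDF and a normal CDF fitted with the sample mean and standard deviation. The function name `normality_effect_sizes` is hypothetical.

```python
import numpy as np
from scipy import stats


def normality_effect_sizes(samples):
    """Descriptive measures of how far a sample is from a fitted normal.

    All three values are close to 0 for data that is close to normal.
    """
    samples = np.asarray(samples, dtype=float).ravel()
    mu, sigma = samples.mean(), samples.std()
    # Only the KS statistic is used, as a descriptive distance. Its p-value is
    # ignored because mu and sigma were estimated from the same data.
    ks = stats.kstest(samples, "norm", args=(mu, sigma))
    return {
        "skewness": float(stats.skew(samples)),
        "excess_kurtosis": float(stats.kurtosis(samples)),  # Fisher definition: 0 for a normal
        "max_cdf_gap": float(ks.statistic),
    }


# Possible usage with the array from the question (an assumption, not run here):
# print(normality_effect_sizes(without_diag))
```

Whether values of a given size matter depends on what the normal approximation is going to be used for.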