In the study and use of statistics, the idea that particular statistics converge "almost certainly" to some value as the sample size $N$ diverges plays a key role (e.g. the central limit theorem, the law of large numbers). Thinking about the phenomenon of $p$-hacking has led me to wonder: what are the limits of these theorems? In particular, can the following statement be proven?
Consider the family of all statistics that converge almost certainly to some "true" value as the sample size $N\rightarrow\infty$ while the number of statistics $n$ is held fixed. If we instead let $n=f(N)\rightarrow\infty$ and maintain a fixed criterion for the significance of a statistic, then the probability of a type I error tends to 1.
By "family of statistics that converge almost certainly" I mean that I'm only interested in statistics that, considered in isolation, each converge almost certainly to some "true" value. And by "significant" I mean significant by any metric used to test the hypothesis that measures distance from the null hypothesis for a particular statistic considered in isolation (e.g. the $p$-value: the probability that the null hypothesis would produce a result at least as extreme as the one observed).
Basically, I'm wondering whether it has been proven that letting the number of hypotheses tested grow with the sample size in some fashion (e.g. $n=N/2$ or $n=\sqrt{N}$) inevitably produces false positives.
Let $p>0$ be the probability of a type I error in a single test, and let $X_n$ count the number of false positives among $n$ tests. Assuming independence (for mathematical convenience), $X_n \sim \mathrm{Bin}(n,p)$, so $$ \lim_{n\to\infty}\mathbb{P}(X_n \ge 1) = 1 - \lim_{n\to\infty}\mathbb{P}(X_n = 0)=1-\lim_{n\to\infty}(1-p)^n = 1. $$
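A quick sketch of this argument numerically, under the same independence assumption (the per-test level $p=0.05$ and the trial counts are arbitrary choices for illustration): the exact probability $1-(1-p)^n$ of at least one false positive is compared against a Monte Carlo simulation of $n$ independent tests.

```python
import random

random.seed(0)

p = 0.05       # per-test type I error rate (illustrative choice)
trials = 10_000  # number of simulated batches of tests

for n in [1, 10, 100, 1000]:
    # Exact probability of at least one false positive among n tests
    exact = 1 - (1 - p) ** n
    # Monte Carlo: each batch runs n independent tests under the null;
    # a batch "hits" if any test falsely rejects
    hits = sum(
        any(random.random() < p for _ in range(n))
        for _ in range(trials)
    )
    print(f"n={n:5d}  exact={exact:.4f}  simulated={hits / trials:.4f}")
```

Both columns approach 1 as $n$ grows, consistent with the limit above.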