Confidence intervals vs law of the iterated logarithm


Let $X_i$ be iid random variables with mean $\mu$ and variance $1$. Let $S_n=\frac 1n \sum_{k=1}^n X_k$ denote the sample mean. By the CLT it is well known that $(S_n - \frac{1.96}{\sqrt n}, S_n + \frac{1.96}{\sqrt n})$ is an asymptotic $95\%$ confidence interval for $\mu$.
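As a sanity check on the coverage claim, here is a quick simulation sketch; the normal data, the value $\mu = 0.5$, and the sample size are illustrative assumptions, not part of the question:

```python
import numpy as np

# Illustrative assumptions: X_i ~ N(mu, 1) with mu = 0.5, n = 1000.
rng = np.random.default_rng(0)
mu, n, trials = 0.5, 1000, 20000

samples = rng.normal(loc=mu, scale=1.0, size=(trials, n))
means = samples.mean(axis=1)                 # S_n for each trial
half_width = 1.96 / np.sqrt(n)

# Fraction of trials whose interval (S_n - 1.96/sqrt(n), S_n + 1.96/sqrt(n))
# actually contains mu:
coverage = np.mean((means - half_width < mu) & (mu < means + half_width))
print(coverage)                              # close to 0.95
```

Each row is an independent experiment of fixed size $n$; the coverage is taken across experiments, not along a single growing sample, which turns out to be the heart of the question.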

By the law of the iterated logarithm, $\limsup_n \frac{\sqrt n}{\sqrt {\log \log n} }(S_n-\mu) = \sqrt 2$ almost surely. Thus for almost every $\omega$, we have $\frac{\sqrt n}{\sqrt {\log \log n} }(S_n-\mu) \approx \sqrt 2$ infinitely often, and hence $ S_n \approx \mu +\frac{\sqrt {2 \log \log n}}{\sqrt n }$ infinitely often.

For large $n$, $\sqrt {2 \log \log n}$ is bigger than $1.96$, so the sample mean exits the confidence interval infinitely often, which seems to contradict the $95\%$ coverage. Can someone shed light on this paradox (if there is any)?
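Incidentally, $\sqrt{2 \log \log n}$ grows extremely slowly; a quick numerical check of where it overtakes $1.96$ (the particular values of $n$ are arbitrary):

```python
import math

# sqrt(2 log log n) first exceeds 1.96 when n is around a thousand,
# and grows very slowly afterwards.
vals = {n: math.sqrt(2 * math.log(math.log(n)))
        for n in (10**3, 10**6, 10**12, 10**100)}
for n, v in vals.items():
    print(n, round(v, 3))
```

Even at $n = 10^{100}$ the bound is only about $3.3$, so the "excursions" the LIL guarantees are only modestly larger than the CLT scale.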


2 Answers


Let $X_1,X_2,\dots $ be iid with $X_1\sim \mathcal{N}(0,1)$; then $$ W_n=\frac{\sqrt{n}}{\sqrt{2\log \log n}}S_n \sim \mathcal{N}\left(0,\frac{1}{2\log \log n}\right).$$ Note that $$W_n \to 0$$ in law (and hence in probability), but $W_n$ does not converge in the sense of a.s. convergence. Indeed, by the law of the iterated logarithm, $$ \liminf_n W_n =-1,\quad \limsup_n W_n =1$$ with probability one.
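A small sketch of this shrinking variance, sampling $S_n$ directly from its exact $\mathcal N(0, 1/n)$ law instead of simulating the $X_i$ (the sample sizes and trial count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
trials = 50000
stds = {}
for n in (10**2, 10**4, 10**8):
    # For N(0,1) data, S_n ~ N(0, 1/n) exactly, so sample it directly.
    s_n = rng.normal(scale=1 / np.sqrt(n), size=trials)
    w_n = np.sqrt(n) / np.sqrt(2 * np.log(np.log(n))) * s_n
    stds[n] = w_n.std()
    # Empirical std of W_n vs. the theoretical 1/sqrt(2 log log n):
    print(n, stds[n], 1 / np.sqrt(2 * np.log(np.log(n))))
```

Across independent trials at a fixed $n$, $W_n$ concentrates near $0$; the a.s. statement $\limsup W_n = 1$ is about a single path as $n$ grows, which this fixed-$n$ picture does not capture.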


I know this is old, but I think this problem perfectly encapsulates some common misunderstandings about the CLT, which I have been thinking about a lot.

The CLT says that $\sqrt{n}\left(S_n - \mu\right)$ converges to $\mathcal{N}(0, \sigma^2)$ in distribution. This is key, because it actually says nothing about the convergence of $\sqrt{n}\left(S_n - \mu\right)$ for any given sample, i.e. any given sequence of measurements of the $X_i$, as $n\to \infty$. In fact, you can see from the LIL that, viewed as a statistic of a sample of size $n \to \infty$ (i.e. a test where you keep collecting more data), $\sqrt{n}\left(S_n - \mu\right)$ fails to converge, and indeed has $\limsup$ equal to $+\infty$, with probability $1$! This is in stark contrast to the (Strong) Law of Large Numbers, which says that $S_n$ converges to $\mu$ with probability $1$.
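The contrast can be sketched along a single path; in this hedged example the normal data and $\mu = 2$ are illustrative assumptions, with the running mean settling down while $\sqrt{n}(S_n - \mu)$ along the same path does not:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, N = 2.0, 10**6
x = rng.normal(loc=mu, scale=1.0, size=N)

n = np.arange(1, N + 1)
running_mean = np.cumsum(x) / n            # S_n along one sample path
z = np.sqrt(n) * (running_mean - mu)       # sqrt(n)(S_n - mu) along the same path

print(running_mean[-1])                    # settles near mu = 2 (SLLN)
print(np.abs(z[1000:]).max())              # z keeps fluctuating; it never settles
```

The first statistic is a convergent sequence for almost every path; the second is a sequence that keeps fluctuating forever, even though at each fixed $n$ it has an approximately normal distribution across paths.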

The way to think about the convergence in distribution in the CLT is that, the larger $n$ gets, the more that the distribution of the random variable $\sqrt{n}\left(S_n - \mu\right)$ looks normal. In other words, if you fix some large $n$, and perform a bunch of different tests where, in each, you collect a sample of size $n$ and then calculate $\sqrt{n}\left(S_n - \mu\right)$, then all of these different calculations will be distributed approximately like $\mathcal{N}(0, \sigma^2)$, and that approximation will get better the bigger you make $n$.
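A sketch of that fixed-$n$, many-tests picture, using deliberately non-normal (standardized uniform) data; the particular $n$ and trial count are arbitrary choices:

```python
import numpy as np

# Fixed n, many independent experiments; uniform data standardized to
# mean 0 and variance 1 (an illustrative non-normal choice).
rng = np.random.default_rng(3)
n, trials = 500, 20000

u = rng.uniform(size=(trials, n))
x = (u - 0.5) * np.sqrt(12.0)            # mean 0, variance 1
z = np.sqrt(n) * x.mean(axis=1)          # sqrt(n)(S_n - mu) per experiment

frac = np.mean(np.abs(z) < 1.96)
print(frac)                               # close to 0.95, as the CLT predicts
```

Even though each $X_i$ is uniform, the distribution of $\sqrt{n}\left(S_n - \mu\right)$ across the experiments is close to standard normal at this $n$.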

Again, a core difference between the CLT on one hand and the LIL and LLN on the other is that the latter two give almost sure convergence, whereas the former gives only convergence in distribution. Analysis trains us to think of this as merely a weaker statement, but it actually calls for a rather different interpretation. Namely, the LIL and LLN tell you what happens to their respective statistics as you collect more and more data along a given sample path, and thus also let you say things about the distributions of these statistics, viewed as random variables, for fixed large sample sizes. For example, since we know that $S_n$ converges to $\mu$ as the sample grows, with probability $1$, we know that the random variables $S_n$ converge to the constant $\mu$. The full statement of the LIL tells you that its statistic doesn't converge anywhere almost surely, although it does converge in probability to $0$ (the Wikipedia discussion is worth viewing here).

On the other hand, the CLT doesn't tell you anything about $\sqrt{n}\left(S_n - \mu\right)$ as a statistic of a single growing sample, which is just as well, since the LIL says it doesn't converge. What it does tell us is how to think about the distribution of $\sqrt{n}\left(S_n - \mu\right)$ as a random variable for a fixed sample size $n$, with the understanding that this description is an approximation that becomes more accurate the larger $n$ is.