Weird behaviour of CLT's application to binomial.


I am simulating the following experiment for every $n$ in the set $\{1,2,3,\dots,100\}$.

(0) Set $k=0$.

(1) Generate $n$ $Bernoulli(0.9)$ trials.

(2) Construct estimate $\hat\theta=\frac{1}{n}\sum^{n}_{i=1}x_i$.

(3) Construct confidence interval of $95\%$, i.e. $$\bigg[\hat\theta-\frac{\sqrt{\hat\theta(1-\hat\theta)}}{\sqrt{n}}\Phi^{-1}(0.975), \ \ \hat\theta+\frac{\sqrt{\hat\theta(1-\hat\theta)}}{\sqrt{n}}\Phi^{-1}(0.975)\bigg]$$

(4) Check whether $0.9$ is in the confidence interval. If yes, add $1$ to $k$.

(5) Repeat steps $1-4$ 100,000 times and calculate $k/100,000$.
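The procedure above can be sketched in Python using only the standard library (the function name `coverage` and the reduced default repetition count are my own choices for illustration; the post uses 100,000 repetitions):

```python
import random
from statistics import NormalDist


def coverage(n, p=0.9, reps=100_000, seed=0):
    """Estimate the coverage probability of the 95% Wald interval
    for n Bernoulli(p) trials, following steps (0)-(5) of the post."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)  # Phi^{-1}(0.975), roughly 1.96
    k = 0
    for _ in range(reps):
        # Steps (1)-(2): draw n Bernoulli(p) trials and form theta-hat.
        x = sum(1 for _ in range(n) if rng.random() < p)
        theta = x / n
        # Step (3): half-width of the Wald confidence interval.
        half = z * (theta * (1 - theta) / n) ** 0.5
        # Step (4): count whether the true p lies in the interval.
        if theta - half <= p <= theta + half:
            k += 1
    # Step (5): empirical coverage.
    return k / reps
```

Plotting `coverage(n)` against `n` should reproduce the graphs described below. Note that for very small $n$ the interval can be degenerate: with $n=1$, $\hat\theta$ is $0$ or $1$, so $\hat\theta(1-\hat\theta)=0$ and the interval never contains $0.9$.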

Now, according to the Central Limit Theorem, I would expect the normal approximation to improve as $n$ grows (a common rule of thumb is that $n>30$ suffices), so the simulated coverage should be close to $0.95$. But when I plotted $n$ on the x-axis against the value obtained in step (5) of the simulation (i.e. $k/100{,}000$) on the y-axis, I observed very strange behaviour (see picture below). Could anyone explain it?

[Plot: empirical coverage $k/100{,}000$ against $n$ for $Bernoulli(0.9)$]

Below is the same simulation procedure, but for $Bernoulli(0.5)$ rather than $Bernoulli(0.9)$.

[Plot: empirical coverage $k/100{,}000$ against $n$ for $Bernoulli(0.5)$]

Below is the graph of the simulation with $Bernoulli(0.9)$, using the t-statistic rather than the normal one.

[Plot: empirical coverage $k/100{,}000$ against $n$ for $Bernoulli(0.9)$ with the t-statistic]