Digits: convergence of the number of $00$ pairs


Let $n\ge2$ and let $X_1,X_2,\ldots,X_{n+1}$ be an i.i.d. Bernoulli $\mathcal{B}(\frac1{2})$ sequence.

Let $N=\sum_{i=1}^n Y_i$ be the number of occurrences of $00$ in the sequence, i.e. each $Y_i$ is the indicator $$Y_i=1 \;\mbox{if}\; X_{i}=0\;\mbox{and}\; X_{i+1}=0; \quad Y_i=0\; \mbox{otherwise}.$$

Does $$\frac{N-E(N)}{\sqrt{V(N)}}\xrightarrow{d} \mathcal{N}(0,1)\quad\text{as } n\to\infty?$$

I cannot apply the CLT directly because the $Y_i$ are dependent.
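Before looking for a proof, a Monte Carlo sanity check is easy to run. The sketch below (my own simulation, not part of the question) standardizes simulated values of $N$ empirically and compares their empirical CDF at a couple of points against the standard normal values $\Phi(0)=0.5$ and $\Phi(1)\approx0.841$:

```python
import random
import statistics

def count_00(n, rng):
    """Number of adjacent 00 pairs among X_1,...,X_{n+1} fair coin flips."""
    x = [rng.randint(0, 1) for _ in range(n + 1)]
    return sum(1 for i in range(n) if x[i] == 0 and x[i + 1] == 0)

def standardized_samples(n, trials, seed=0):
    """Simulate N repeatedly and standardize by the empirical mean/sd."""
    rng = random.Random(seed)
    samples = [count_00(n, rng) for _ in range(trials)]
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)
    return [(s - mean) / sd for s in samples]

if __name__ == "__main__":
    z = standardized_samples(n=1000, trials=4000)
    # Empirical P(Z <= 0) and P(Z <= 1); for a standard normal these are
    # 0.5 and about 0.841, so the printed values should be close to those.
    p0 = sum(1 for v in z if v <= 0) / len(z)
    p1 = sum(1 for v in z if v <= 1) / len(z)
    print(round(p0, 2), round(p1, 2))
```

The agreement is of course only suggestive; the answer below gives an actual argument.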


Yes. I'm not sure of the best way to prove it; here is one approach (perhaps someone else can give a better/shorter proof?). The idea is that $(Y_1+Y_2+Y_3)$ and $(Y_5+Y_6+Y_7)$ are i.i.d., since $Y_4$ is "missing" and so the two blocks share no underlying $X_i$. Pulling out every 4th $Y_i$ changes the sum significantly, but what if we pull out every $k$th $Y_i$ and let $k$ be large?


Fix an integer $k$ larger than 2 and assume $n=Mk$. Then $$ N := \sum_{i=1}^n Y_i = \sum_{m=0}^{M-1}\sum_{j=1}^k Y_{mk+j} = \left(\sum_{m=0}^{M-1}\sum_{j=1}^{k-1} Y_{mk+j}\right) + \sum_{m=1}^{M} Y_{mk} $$ so that $$ R := N-E[N]=\sum_{i=1}^n (Y_i - E[Y_i]) = \left(\sum_{m=0}^{M-1}\underbrace{\sum_{j=1}^{k-1} (Y_{mk+j}-E[Y_{mk+j}])}_{i.i.d.}\right) + \sum_{m=1}^{M} \underbrace{(Y_{mk}-E[Y_{mk}])}_{i.i.d.}$$ and $Var(N) = E[R^2]$. Define \begin{align*} Z_m &= \sum_{j=1}^{k-1} (Y_{mk+j}-E[Y_{mk+j}])\\ V_m &= Y_{mk}-E[Y_{mk}] \end{align*} Then $\{Z_m\}$ are i.i.d. zero mean, $\{V_m\}$ are i.i.d. zero mean, $Z_m$ and $V_j$ are independent whenever $|m -j|\geq 2$, and $$ R = \sum_{m=0}^{M-1} Z_m + \sum_{m=1}^M V_m $$

Now write $$ \frac{R}{\sqrt{ME[Z_1^2]}} = G_M + H_M, \qquad G_M := \frac{\sum_{m=0}^{M-1} Z_m}{\sqrt{ME[Z_1^2]}}, \quad H_M := \frac{\sum_{m=1}^{M} V_m}{\sqrt{ME[Z_1^2]}} $$ Then $G_M$ has zero mean and unit variance, $H_M$ has zero mean and variance $\frac{E[V_1^2]}{E[Z_1^2]}$, and by the central limit theorem the distributions of $G_M$ and $H_M$ converge to Gaussian as $M\rightarrow\infty$ (unfortunately $G_M$ and $H_M$ are not necessarily independent). Hence $$ \frac{N-E[N]}{\sqrt{Var(N)}} =\frac{R}{\sqrt{ME[Z_1^2]}}\cdot\frac{\sqrt{ME[Z_1^2]}}{\sqrt{Var(N)}}= (G_M+H_M)\frac{\sqrt{M E[Z_1^2]}}{\sqrt{Var(N)}} $$

Further, it can be shown that for all $M$ we have \begin{align*} &\lim_{k\rightarrow\infty} \frac{E[V_1^2]}{E[Z_1^2]}=0\\ &\lim_{k\rightarrow\infty} \frac{M E[Z_1^2]}{Var(N)} = 1 \end{align*} So if $M$ and $k$ are both large we have $$ \frac{N-E[N]}{\sqrt{Var(N)}} = (G_M+H_M)\underbrace{\frac{\sqrt{M E[Z_1^2]}}{\sqrt{Var(N)}}}_{\approx 1} $$ where $G_M$ is approximately Gaussian in distribution with mean 0 and variance 1, while $H_M$ is negligible: it is approximately Gaussian in distribution with mean 0 and variance that goes to 0 as $k$ grows.
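The argument normalizes by $Var(N)$, which for the fair coin has a closed form: $E[Y_i]=\frac14$, $Var(Y_i)=\frac{3}{16}$, and $Cov(Y_i,Y_{i+1})=\frac18-\frac1{16}=\frac1{16}$, giving $E[N]=\frac n4$ and $Var(N)=\frac{5n-2}{16}$. These closed forms are my own side computation, not stated in the answer; the sketch below verifies them exactly for small $n$ by enumerating all $2^{n+1}$ sequences:

```python
from itertools import product
from fractions import Fraction

def moments(n):
    """Exact E[N] and Var(N) for a fair coin, by brute-force enumeration
    of all 2^(n+1) equally likely sequences X_1,...,X_{n+1}."""
    total = 2 ** (n + 1)
    e = Fraction(0)
    e2 = Fraction(0)
    for x in product((0, 1), repeat=n + 1):
        N = sum(1 for i in range(n) if x[i] == 0 and x[i + 1] == 0)
        e += Fraction(N, total)
        e2 += Fraction(N * N, total)
    return e, e2 - e * e

if __name__ == "__main__":
    # Check the claimed closed forms E[N] = n/4 and Var(N) = (5n-2)/16.
    for n in range(2, 9):
        mean, var = moments(n)
        assert mean == Fraction(n, 4)
        assert var == Fraction(5 * n - 2, 16)
    print("E[N] = n/4 and Var(N) = (5n-2)/16 verified for n = 2..8")
```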


Hopefully that was convincing. Here are some pesky details I am skipping:

  • This only considers $n$ of the form $n=Mk$. One must show that for general integers $n=Mk+r$ with $0\le r<k$, the effect of the "residual" $r$ terms is asymptotically negligible.

  • Details on the "it can be shown" parts.

  • Details on being precise about taking "both $M$ and $k$ large."
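On the "it can be shown" limits: using the exact fair-coin moments (my own computation, assumed below rather than proved) one gets $E[V_1^2]=Var(Y_1)=\frac{3}{16}$, $E[Z_1^2]=Var\!\left(\sum_{j=1}^{k-1}Y_{mk+j}\right)=\frac{5k-7}{16}$ for $k\ge3$, and $Var(N)=\frac{5Mk-2}{16}$, so the two ratios are $\frac{3}{5k-7}\to0$ and $\frac{M(5k-7)}{5Mk-2}\to1$ as $k\to\infty$. A small numeric check:

```python
from fractions import Fraction

def ratios(M, k):
    """Exact values of E[V_1^2]/E[Z_1^2] and M*E[Z_1^2]/Var(N),
    assuming the closed-form fair-coin moments stated above (k >= 3)."""
    ev2 = Fraction(3, 16)                 # E[V_1^2] = Var(Y_1)
    ez2 = Fraction(5 * k - 7, 16)         # E[Z_1^2] = Var of k-1 consecutive Y's
    var_n = Fraction(5 * M * k - 2, 16)   # Var(N) with n = M*k
    return ev2 / ez2, (M * ez2) / var_n

if __name__ == "__main__":
    # First ratio tends to 0 and second tends to 1 as k grows, for fixed M.
    for k in (10, 100, 1000, 10000):
        r1, r2 = ratios(M=50, k=k)
        print(k, float(r1), float(r2))
```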