How can I find a bound for these states by the CLT?


I have the following question:

There are $1000$ independent random variables $X_{1},X_{2},\ldots,X_{1000}$, where each $X_{i}$ is uniformly distributed on $[0,1)$. Furthermore, let $$ Y_{i} = \begin{cases} 0, & X_{i} < 0.5 \\ 1, & X_{i} \geq 0.5 \end{cases} \;. $$ How can I find the following probabilities using the central limit theorem? $$ P\left( \left| \sum_{i=1}^{1000} X_i- \sum_{i=1}^{1000} Y_i \right| \ge 7 \right) \qquad\text{and}\qquad P\left( \sum_{i=1}^{1000} \left| X_i- Y_i \right| \ge 7 \right) $$ Thank you for any help.


There are 2 best solutions below

0
On

I am going to treat the threshold generally and write the event as $\ge t$ for some $t \ge 0$ (with $t = 7$ in the question). Let $Z_i = X_i - Y_i$. Then $Z_i = X_i - 1\{X_i \ge 0.5\}$, hence $$ \mathbb E Z_i = \mathbb E X_i - \mathbb P(X_i \ge 0.5) = 0.5 - 0.5 = 0, $$ and, since $\mathbb E Z_i = 0$ and $1\{X_i \ge 0.5\}^2 = 1\{X_i \ge 0.5\}$, \begin{align} \operatorname{var}(Z_i) = \mathbb E (Z_i^2) &= \mathbb E X_i^2 - 2\mathbb E X_i 1\{X_i \ge 0.5\} + \mathbb E 1\{X_i \ge 0.5\}\\ &= \int_0^1 x^2 dx - 2 \int_{0.5}^1 x dx + 0.5 = \frac13 - \frac34 + \frac12 = \frac1{12} =: v. \end{align}

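By the CLT, $\frac{1}{\sqrt{nv}}\sum_{i=1}^n Z_i$ is approximately standard normal, so $$ \mathbb P\Big( \Big|\sum_{i=1}^n Z_i \Big| \ge t\Big) \approx \mathbb P \Big(|W| \ge \frac{t}{\sqrt{nv}}\Big) $$ where $W \sim N(0,1)$. You can proceed similarly for $\mathbb P(\sum_{i=1}^n |Z_i| \ge t)$, but note that $\mathbb E |Z_i| \neq 0$ in this case, so the sum must be centered by $n\,\mathbb E|Z_i|$ before standardizing.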
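As a quick check of the steps above, here is a minimal Python sketch. It assumes $n = 1000$ and $t = 7$ from the question; the variable names are illustrative and not part of the original answer. It evaluates $v$ exactly with rational arithmetic and then the two-sided normal tail $\mathbb P(|W| \ge t/\sqrt{nv})$.

```python
from fractions import Fraction
from math import sqrt
from statistics import NormalDist

# v = E[X^2] - 2 E[X 1{X >= 0.5}] + P(X >= 0.5), evaluated exactly:
#   int_0^1 x^2 dx = 1/3,  int_{0.5}^1 x dx = 1/2 - 1/8 = 3/8
v = Fraction(1, 3) - 2 * (Fraction(1, 2) - Fraction(1, 8)) + Fraction(1, 2)

n, t = 1000, 7                      # values assumed from the question
z = t / sqrt(n * float(v))          # standardized threshold t / sqrt(n v)

# Two-sided tail of a standard normal: P(|W| >= z) = 2 * (1 - Phi(z))
two_sided_tail = 2 * (1 - NormalDist().cdf(z))

print(v, z, two_sided_tail)
```

Using `Fraction` keeps the variance exact, so any arithmetic slip in the integrals shows up immediately rather than hiding in floating-point noise.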

0
On

Consider $Z_i=X_i-Y_i$. This is uniformly distributed on $\left[-\dfrac12,\dfrac12\right)$, so it has mean $0$ and variance $\dfrac1{12}$. Then $\displaystyle S= \sum_{i=1}^{1000} X_i- \sum_{i=1}^{1000} Y_i = \sum_{i=1}^{1000} Z_i$, and since the $Z_i$ are i.i.d., the sum $S$ has mean $0$ and variance $\dfrac{1000}{12}$.

Approximating by the normal distribution using a Central Limit Theorem argument, you get $$P(|S| \ge 7) \approx 1 - \Phi\left(\dfrac{7}{\sqrt{1000/12}}\right) + \Phi\left(\dfrac{-7}{\sqrt{1000/12}}\right) \approx 2 \Phi(-0.7668) \approx 0.4432$$
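The normal approximation can be sanity-checked by simulation. The following is a minimal Monte Carlo sketch using only the Python standard library; the function name `estimate_tail`, the number of trials, and the seed are assumptions chosen for illustration.

```python
import random

def estimate_tail(trials=2000, n=1000, threshold=7.0, seed=0):
    """Estimate P(|sum_i (X_i - Y_i)| >= threshold) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = 0.0
        for _ in range(n):
            x = rng.random()               # X_i uniform on [0, 1)
            y = 1.0 if x >= 0.5 else 0.0   # Y_i as defined in the question
            s += x - y
        if abs(s) >= threshold:
            hits += 1
    return hits / trials

estimate = estimate_tail()
print(estimate)
```

With 2000 trials the standard error of the estimate is roughly $0.01$, so the result should land close to the CLT value rather than match it digit for digit.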

As an aside, this would be equivalent to $1-G_{1000}(7)$ in my note "May not sum to total due to rounding: the probability of rounding errors", though I only did the exact calculations up to $G_{100}(5.5)$.

For the second part of your question, if you have $A_i=|X_i-Y_i|=|Z_i|$, then $A_i$ is uniformly distributed on $\left[0,\dfrac12\right]$, so it is non-negative with mean $\dfrac14$ and variance $\dfrac{1}{48}$. Write $\displaystyle T= \sum_{i=1}^{1000} |X_i- Y_i| = \sum_{i=1}^{1000} A_i$. Since the $A_i$ are i.i.d., the sum $T$ has mean $\dfrac{1000}{4}$ and variance $\dfrac{1000}{48}$. Approximating by the normal distribution using a Central Limit Theorem argument, you get $$P(T \ge 7) \approx 1 - \Phi\left(\dfrac{7-1000/4}{\sqrt{1000/48}}\right) \approx 1-\Phi(-53.2386)\approx 1$$ so the event is almost certain, as you might intuitively expect: the threshold $7$ lies far below the mean of $250$.
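The arithmetic for the second sum can be reproduced with a short Python sketch using the standard library's `statistics.NormalDist`; the variable names here are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

n, threshold = 1000, 7

# |Z_i| is uniform on [0, 1/2]: mean 1/4, variance (1/2)^2 / 12 = 1/48
mean_T = n / 4
var_T = n / 48

# Standardize the threshold, then take the upper normal tail
z = (threshold - mean_T) / sqrt(var_T)
prob = 1 - NormalDist().cdf(z)

print(z, prob)
```

The standardized threshold sits more than 50 standard deviations below the mean, so in floating point the upper tail is indistinguishable from $1$.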