How to use Chebyshev's inequality and CLT to determine bounds.

I was doing practice problems in preparation for an examination and ran into a problem in a book that asks us to determine the sample size needed for the observed proportion of rolls showing a specific die face (let's say 1) to lie in [0.111111, 0.222222] with 95% confidence, using the CLT and Chebyshev's inequality.

For Chebyshev, what I have so far is

$$Pr\left(|\overline{X} − \mu| < \frac{k\sigma}{\sqrt{n}}\right) \geq 1−\frac{1}{k^2}.$$ I have determined that $k$ must be 4.47 in order to get a 95% interval, in which case the bound inside the probability would be $<4.47\sigma/\sqrt{n}$. However, I am not sure how to proceed.

For CLT, I am unsure as to how to proceed.

Best answer

Well, you have found $k=\sqrt{20}\approx 4.472135955$. For one trial the probability of getting 1 on a die is $1/6$. You have $n$ independent random variables $X_1,\ldots,X_n$ where $X_i=1$ if you see the specific die face (1) in the $i$-th roll of the die, and $X_i=0$ if any other face appears in the $i$-th roll. The sum $X_1+\ldots+X_n$ is the total number of times 1 is shown in $n$ rolls, and $\overline X=\frac{X_1+\ldots+X_n}{n}$ is the observed proportion of 1s.

Then $$Pr(X_i=1)=\frac16, \quad Pr(X_i=0)=\frac56, \quad E[X_i]=\frac16, \quad Var[X_i]=\frac{1}{6} \cdot \frac{5}{6}$$

Therefore, standard deviation for $X_i$ is $\sqrt{\frac{1}{6} \cdot \frac{5}{6}}=\frac{\sqrt{5}}{6}$ and SD for $\overline X$ is $\sqrt{n}$ times smaller: $$ \sigma=\frac{\sqrt{5}}{6\sqrt{n}} $$ And $$\mu=E[\overline X]=E[X_1]=\frac16.$$

We can rewrite Chebyshev inequality as (keep in mind that $k=\sqrt{20}$) $$ Pr\left(|\overline{X} − \tfrac16| < \tfrac{k\sqrt{5}}{6\sqrt{n}}\right) = Pr\left(|\overline{X} − \tfrac16| < {k\sigma}\right) \geq 1−\frac{1}{k^2}. $$

Let us compare the desired interval $0.111111\leq \overline X \leq 0.222222$ with the interval from the Chebyshev inequality, $|\overline{X} − \tfrac16| < {k\sigma}$. The value $\mu=1/6=0.166666$ is exactly the midpoint of $[0.111111, 0.222222]$, so both intervals coincide when ${k\sigma}=0.222222-0.166666=0.055555$, that is, when $k\sigma=\tfrac{1}{18}$.

Finally we can find $n$ from the equation $$ 0.055555=k\sigma=\sqrt{20}\frac{\sqrt{5}}{6\sqrt{n}} \Rightarrow \sqrt{n}=30 \Rightarrow n=900. $$
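As a quick check of the arithmetic, here is a minimal Python sketch of the same calculation. The function name `chebyshev_sample_size` and its parameters are illustrative only, and it takes the half-width to be exactly $1/18$ (the rounded $0.055555$ above). Setting $k\sigma$ equal to the half-width $h$ with $k=1/\sqrt{\alpha}$ is the same as requiring $n \geq \frac{Var[X_1]}{\alpha h^2}$.

```python
from fractions import Fraction
import math


def chebyshev_sample_size(p, half_width, alpha):
    """Smallest n for which Chebyshev guarantees Pr(|Xbar - p| < h) >= 1 - alpha.

    Chebyshev gives Pr(|Xbar - p| >= h) <= p(1-p) / (n h^2), so it is
    enough to take n >= p(1-p) / (alpha h^2).
    """
    variance = p * (1 - p)  # Var[X_i] for a 0/1 indicator variable
    return math.ceil(variance / (alpha * half_width**2))


# Exact rationals avoid floating-point noise right at the integer 900.
n = chebyshev_sample_size(Fraction(1, 6), Fraction(1, 18), Fraction(1, 20))
print(n)  # 900
```

With $p=1/6$, $h=1/18$ and $\alpha=1/20$ the ratio is exactly $\frac{5}{36}\cdot 20\cdot 324=900$, matching the value found by hand.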

Using CLT: with $\sigma=\sqrt{\tfrac{Var[X_1]}{n}}$ $$ Pr\left(|\overline{X} − \mu| < k\sigma\right) \approx \Phi(k)-\Phi(-k)=2\Phi(k)-1 $$

We need $2\Phi(k)-1 = 0.95$, $\Phi(k)=0.975$ and $k=1.96$.

Then solve the equation $0.055555=k\sigma$ again, now with $k=1.96$: $$0.055555=k\sigma=\tfrac{1.96\sqrt{5}}{6\sqrt{n}}\quad \Rightarrow \quad \sqrt{n}=13.148 \quad \Rightarrow \quad n=172.9 $$ Rounding up to the next integer gives $n=173$.
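The CLT calculation can be sketched in Python the same way. The names `clt_sample_size` and `exact_coverage` are illustrative, and the half-width is again assumed to be exactly $1/18$. The second function is an extra sanity check that is not part of the textbook method: it sums the exact binomial probabilities of the counts whose proportion lands in the target interval, since the CLT value is only an approximation.

```python
import math
from statistics import NormalDist


def clt_sample_size(p, half_width, confidence):
    """Smallest n with k * sqrt(p(1-p)/n) <= half_width, where Phi(k) = (1 + confidence)/2."""
    k = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    variance = p * (1 - p)  # Var[X_i] for a 0/1 indicator variable
    return math.ceil(k**2 * variance / half_width**2)


def exact_coverage(n, p, lo, hi):
    """Exact binomial probability that lo <= (number of successes)/n <= hi."""
    first = math.ceil(n * lo)   # smallest count inside the interval
    last = math.floor(n * hi)   # largest count inside the interval
    return sum(
        math.comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(first, last + 1)
    )


n = clt_sample_size(1 / 6, 1 / 18, 0.95)
print(n)  # 173
print(exact_coverage(n, 1 / 6, 1 / 9, 2 / 9))
```

The exact coverage printed on the last line may differ slightly from 0.95, because the normal approximation ignores the discreteness and skewness of the binomial distribution at this sample size.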