Epsilon-Delta analysis for the Law of Large Numbers?


Law of Large Numbers: For a sequence of independent and identically distributed random variables $X_1, X_2, X_3, \ldots, X_n$, each with expected value $E[X_i] = \mu$, and with sample mean $\overline{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$, for every $\epsilon > 0$:

$$\lim_{{n \to \infty}} P\left( \left| \overline{X_n} - \mu \right| \geq \epsilon \right) = 0$$

My Question: Provided $X_1, X_2, X_3, \ldots, X_n$ are all i.i.d. draws from some specific probability distribution $g(x; \theta)$: what value of $n$ is required to "probabilistically" achieve a given value of $\epsilon$?

As an example, suppose I have 100 randomly generated iid data points. I believe these points came from a Normal Distribution. Suppose there was some "magic" way of knowing that these $n$ = 100 data points were actually generated from a Normal Distribution with $\mu=5$,$\sigma=5$ .

Then, on average (i.e., over the infinite universe of all samples of size $n = 100$ from a Normal Distribution with $\mu=5$, $\sigma=5$): what would be the average value of $\epsilon$?
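One way to make "the average value of $\epsilon$" precise is the mean absolute deviation $E\left[\,\left|\overline{X}_n - \mu\right|\,\right]$, which for normal data equals $\sigma\sqrt{2/(\pi n)}$ (the mean of a half-normal). A quick Monte Carlo sketch for the $\mu=5$, $\sigma=5$, $n=100$ example (my own illustration, with this interpretation assumed, not taken from the question):

```python
import math
import random

random.seed(0)
mu, sigma, n, trials = 5.0, 5.0, 100, 20000

# Theoretical mean absolute deviation of the sample mean for normal data:
# E|X_bar - mu| = sigma * sqrt(2 / (pi * n))   (mean of a half-normal)
theory = sigma * math.sqrt(2.0 / (math.pi * n))

# Monte Carlo: average |X_bar - mu| over many samples of size n
total = 0.0
for _ in range(trials):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    total += abs(xbar - mu)
mc = total / trials

print(theory)  # ≈ 0.3989
print(mc)      # should land close to the theoretical value
```

This also answers the bullet points below: the typical deviation scales like $\sigma/\sqrt{n}$, so it depends on $\sigma$ (but not on $\mu$), and it changes with the distribution's variance.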

Can this be answered using Cramér's Theorem on Large Deviations? https://en.wikipedia.org/wiki/Cram%C3%A9r%27s_theorem_(large_deviations)

Thanks!

  • Note: For $n = 100$, would this average value of $\epsilon$ vary for different Normal Distributions? e.g. ($\mu=5$, $\sigma=5$) vs ($\mu=2$, $\sigma=3$)
  • For $n = 100$, would this average value of $\epsilon$ vary for different probability distribution functions? e.g. Normal Distribution vs Exponential Distribution vs Gamma Distribution, etc?

Follow up Question: Applying Hoeffding's Inequality in Real Life

Best Answer

Fix $\epsilon>0$. Suppose $\{X_i\}_{i=1}^{\infty}$ are i.i.d. with mean $\mu$ and variance $\sigma^2>0$. Define the random variable: $$M_n = \frac{1}{n}\sum_{i=1}^n X_i$$

Claim 1: For all $n\in \{1, 2, 3, ...\}$ we have $E[M_n]=\mu$ and $Var(M_n)=\sigma^2/n$.
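Claim 1 holds for any distribution with finite variance, not just the normal. As a sanity check (my own illustration, using an exponential distribution as an arbitrary non-normal example), one can simulate many sample means and compare their empirical mean and variance with $\mu$ and $\sigma^2/n$:

```python
import random
import statistics

random.seed(1)
n, trials = 50, 40000
lam = 0.5                       # Exponential(rate=0.5): mu = 2, sigma^2 = 4
mu, var = 1 / lam, 1 / lam**2

# Draw many independent sample means M_n
means = [sum(random.expovariate(lam) for _ in range(n)) / n
         for _ in range(trials)]

# Empirical check of Claim 1: E[M_n] = mu and Var(M_n) = sigma^2 / n
print(statistics.mean(means))      # should be close to mu = 2
print(statistics.variance(means))  # should be close to var/n = 0.08
```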

Claim 2 (Normal): If $X_i\sim N(\mu,\sigma^2)$ then $M_n\sim N(\mu,\sigma^2/n)$. If we define $G \sim N(0,1)$ then $M_n-\mu$ has the same distribution as $\frac{G\sigma}{\sqrt{n}}$ and
$$ P[|M_n-\mu|\geq \epsilon] = P\left[\left|\frac{G\sigma}{\sqrt{n}}\right| \geq \epsilon\right] = P\left[|G|\geq \frac{\epsilon \sqrt{n}}{\sigma}\right] = 2Q\left(\frac{\epsilon\sqrt{n}}{\sigma}\right)$$ where $Q(\cdot)$ is the Gaussian tail function.
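For the asker's example ($\mu=5$, $\sigma=5$, $n=100$), this formula is exact and easy to evaluate, since $Q(x) = \frac{1}{2}\operatorname{erfc}(x/\sqrt{2})$. A sketch comparing the closed form against simulation (the choice $\epsilon = 1$ is my own, for illustration):

```python
import math
import random

random.seed(2)
mu, sigma, n, eps = 5.0, 5.0, 100, 1.0

def Q(x):
    """Gaussian tail function Q(x) = P[N(0,1) >= x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Exact value from Claim 2: 2 * Q(eps * sqrt(n) / sigma) = 2 * Q(2)
exact = 2 * Q(eps * math.sqrt(n) / sigma)

# Empirical frequency of |M_n - mu| >= eps over many samples
trials = 20000
hits = sum(abs(sum(random.gauss(mu, sigma) for _ in range(n)) / n - mu) >= eps
           for _ in range(trials))

print(exact)          # ≈ 0.0455
print(hits / trials)  # should agree with the exact value
```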

Claim 3 (Bounded): If $a\leq X_i\leq b$ almost surely, then Hoeffding's inequality (https://en.wikipedia.org/wiki/Hoeffding%27s_inequality) gives $$P[|M_n-\mu|\geq \epsilon] \leq 2\exp\left(\frac{-2n\epsilon^2}{(b-a)^2}\right)$$ where I am using the Wikipedia formula with $M_n=S_n/n$ and $t=n\epsilon$.
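A small sketch of the Hoeffding bound in action (the Uniform(0,1) example and the values $n=100$, $\epsilon=0.1$ are my own choices for illustration). Note the bound is valid but typically far from tight:

```python
import math
import random

random.seed(3)
n, eps = 100, 0.1
a, b = 0.0, 1.0           # X_i ~ Uniform(0, 1), so mu = 0.5
mu = 0.5

# Hoeffding bound: 2 * exp(-2 * n * eps^2 / (b - a)^2)
bound = 2 * math.exp(-2 * n * eps**2 / (b - a) ** 2)

# Empirical frequency of |M_n - mu| >= eps
trials = 20000
hits = sum(abs(sum(random.random() for _ in range(n)) / n - mu) >= eps
           for _ in range(trials))
empirical = hits / trials

print(bound)      # ≈ 0.2707
print(empirical)  # much smaller than the bound in this case
```

Solving the bound for $n$ also answers the original question for bounded variables: to guarantee $P[|M_n-\mu|\geq\epsilon]\leq\delta$, it suffices to take $n \geq \frac{(b-a)^2}{2\epsilon^2}\ln\frac{2}{\delta}$.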

Claim 4 (General): If the $X_i$ are not bounded and you don't know the distribution, then (given only the mean and variance) all you can use is Chebyshev's inequality: $$ P[|M_n-\mu|\geq \epsilon] \leq \frac{\sigma^2/n}{\epsilon^2}$$
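Comparing the three claims on the asker's own numbers ($\sigma=5$, $n=100$, with $\epsilon=1$ assumed for illustration) shows how loose the distribution-free bound is next to the exact Gaussian value:

```python
import math

sigma, n, eps = 5.0, 100, 1.0

# Chebyshev (Claim 4): P[|M_n - mu| >= eps] <= (sigma^2 / n) / eps^2
cheb = (sigma**2 / n) / eps**2

# Exact Gaussian value (Claim 2): 2Q(eps*sqrt(n)/sigma) = erfc(2/sqrt(2))
exact = math.erfc((eps * math.sqrt(n) / sigma) / math.sqrt(2.0))

print(cheb)   # 0.25
print(exact)  # ≈ 0.0455 -- Chebyshev is valid but loose
```

The common lesson: all three answers scale the deviation by $\sigma/\sqrt{n}$, so halving the "achievable" $\epsilon$ costs a factor of four in sample size.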