Proof of (a step in the proof of) the Law of Large Numbers


Theorem: Let $f:[0,1] \to \mathbb R$ be a measurable function bounded by $c$. Let $U_1,U_2,\ldots,U_n$ be i.i.d. and Uniform$(0,1)$. Then: $$ P \left( \left\lvert \frac{1}{n}\sum_{k=1}^nf(U_k)-\int_0^1f(x)dx \right\rvert \ge \epsilon \right) \le \frac{c^2}{n\epsilon^2} $$

My attempt at a proof: Since $U_1,\ldots,U_n$ are independent, $f(U_i) ~ \bot ~ f(U_j)$ for all $i \neq j$. Further, the pdf of each uniform is $f_{U_i}(t) = 1$. Thus: $$ E_{U_1}[f(x)] = \int_{\Omega}f(x) f_{U_1}(x)dx = \int_0^1f(x)dx $$ Also note that $\lvert f(x) \rvert \le c \implies f^2(x) \le c^2$, so $E[f^2(x)] \le E[c^2] = c^2$ by the monotonicity of expectation. Therefore:

$$ \operatorname{Var}(f(x)) = E[f^2(x)]-[E(f(x))]^2 \le c^2 $$

Now, letting $Y_k = f(U_k)$, the variables $Y_1,Y_2,\ldots$ are independent random variables in $L^2$ with uniformly bounded variance: $v=\sup_{k \ge 1}\operatorname{Var}(Y_k)\le c^2 < \infty$.

By the Weak Law of Large Numbers:

\begin{align*} P \left( \left\lvert \frac{1}{n}\sum_{k=1}^nf(U_k)-\int_0^1f(x)dx \right\rvert \ge \epsilon \right) &= P \left( \left\lvert \frac{1}{n}\sum_{k=1}^nY_k-E[Y_k] \right\rvert \ge \epsilon \right)\\ &= P \left( \left\lvert \frac{1}{n}\sum_{k=1}^nY_k-E[Y_1] \right\rvert \ge \epsilon \right)\\ &\le \frac{v}{n \epsilon^2}\\ &\le \frac{c^2}{n \epsilon^2}, \end{align*} where the second equality follows from the fact that $f(U_1),f(U_2),\ldots$ all follow the same distribution, and the last inequality follows from the variance bound $v \le c^2$.
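As a numerical sanity check on the bound (not part of the proof), a small Monte Carlo simulation can compare the empirical tail probability with $c^2/(n\epsilon^2)$. The test function $f(x)=\cos(2\pi x)$ (bounded by $c=1$, with $\int_0^1 f = 0$) and the parameters below are illustrative choices, not part of the original statement:

```python
import math
import random

def empirical_tail(n, eps, trials=20000, seed=0):
    """Estimate P(|mean of f(U_k) - integral| >= eps) by simulation,
    for the illustrative choice f(x) = cos(2*pi*x) on [0, 1]."""
    rng = random.Random(seed)
    exact_integral = 0.0  # integral of cos(2*pi*x) over [0, 1] is 0
    exceed = 0
    for _ in range(trials):
        mean = sum(math.cos(2 * math.pi * rng.random()) for _ in range(n)) / n
        if abs(mean - exact_integral) >= eps:
            exceed += 1
    return exceed / trials

n, eps = 100, 0.2
p_hat = empirical_tail(n, eps)
bound = 1.0 / (n * eps * eps)  # c^2 / (n * eps^2) with c = 1
print(f"empirical tail {p_hat:.4f} <= Chebyshev bound {bound:.4f}")
```

The empirical tail comes out far below the bound, as expected: Chebyshev-type bounds are typically loose, and here the true variance of $\cos(2\pi U)$ is $1/2$, well under $c^2 = 1$.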

Does this prove the statement rigorously? Is there anything missing?


On BEST ANSWER

You have indeed got the idea, and your proof looks correct. Some comments:

  • In the second line of your attempt, the expression for the p.d.f. of the uniform distribution on the unit interval is not correct if $t\notin [0,1]$; it should be $f_{U_i}(t) = \mathbf{1}_{[0,1]}(t)$.

  • How do you define the random variable $f(x)$? Do you mean $f(U_1)$?

  • Actually, you don't need to use the weak law of large numbers, but rather a step of its proof, namely Markov's inequality.

  • I would switch the second and third lines, since in the second one it is clearer that the random variables are centered. The point is that after applying Markov's inequality you have to compute $\mathbb E\left[\left(\sum_{j=1}^n f(U_j)-n\,\mathbb E[f(U_1)]\right)^2\right]$, and if you had instead had $\mathbb E\left[\left(\sum_{j=1}^n\left(f(U_j)-\mathbb E[f(U_j)]\right)\right)^2\right]$, this would have been neater.
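For completeness, here is a sketch of the single step the answer refers to: Markov's inequality applied to the squared deviation (i.e., Chebyshev's inequality), with the centering pulled inside the sum so that independence kills the cross terms:

$$ P\left(\left\lvert \frac{1}{n}\sum_{k=1}^n Y_k - E[Y_1] \right\rvert \ge \epsilon\right) \le \frac{E\left[\left(\sum_{k=1}^n \left(Y_k - E[Y_k]\right)\right)^2\right]}{n^2\epsilon^2} = \frac{\sum_{k=1}^n \operatorname{Var}(Y_k)}{n^2\epsilon^2} \le \frac{nc^2}{n^2\epsilon^2} = \frac{c^2}{n\epsilon^2}. $$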