I am self-learning probability theory from Probability and Random Processes by Grimmett and Stirzaker. I don't follow a particular step in the proof of the law of large numbers. For completeness, I reproduce the proof here along with my question.
The law of large numbers.
Let $S_n$ denote the number of occurrences of an event $A$ in $n$ independent trials, where $\mathbb{P}(A) = p$. Then $\frac{S_n}{n}$ converges to $p$ as $n \to \infty$ in the sense that, for all $\epsilon > 0$,
$$\mathbb{P}\left(p - \epsilon \le \frac{S_n}{n} \le p + \epsilon\right) \to 1$$
as $n \to \infty$.
There are certain technicalities involved in the study of the convergence of random variables, and this is the reason for the careful statement of the theorem. For the time being, we encourage the reader to interpret the theorem as asserting simply that the proportion $\frac{S_n}{n}$ of the times that the event $A$ occurs converges, as $n \to \infty$, to its probability $p$. Interpreted in terms of tosses of a fair coin, the theorem implies that the proportion of heads is (with large probability) near to $\frac{1}{2}$.
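Before working through the proof, I convinced myself of the statement numerically. This is just my own quick simulation sketch (not from the book), using only the standard library:

```python
import random

def proportion_of_successes(p, n, seed=0):
    """Simulate n independent Bernoulli(p) trials and return S_n / n."""
    rng = random.Random(seed)
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n

# The proportion S_n / n settles near p = 0.5 as n grows.
for n in (100, 10_000, 1_000_000):
    print(n, proportion_of_successes(0.5, n))
```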
Proof.
Fix an arbitrary $\epsilon > 0$. We have
\begin{align*} \mathbb{P}(S_n \ge n(p + \epsilon)) &= \sum_{k \ge n(p + \epsilon)} \mathbb{P}(S_n = k) \\ &= \sum_{k = m}^{n} {n \choose k} p^k (1 - p)^{n - k} \tag{1} \end{align*}
where $m$ is the least integer greater than or equal to $n(p+\epsilon)$. The following argument is standard in probability theory. Let $\lambda > 0$ and note that, if $k \ge m$,
\begin{align*} \lambda k &\ge \lambda n (p + \epsilon) \\ e^{\lambda k} &\ge e^{\lambda n (p + \epsilon)} \end{align*}
so that $e^{\lambda k - \lambda n (p + \epsilon)} \ge 1$ for every $k \ge m$. Hence:
\begin{align*} \mathbb{P}(S_n \ge n(p + \epsilon)) &= \sum_{k = m}^{n} {n \choose k} p^k (1 - p)^{n - k} \\ &\le \sum_{k = m}^{n} e^{\lambda k - \lambda n (p + \epsilon)} {n \choose k} p^k (1 - p)^{n - k} \\ &= e^{-\lambda n \epsilon} \sum_{k = m}^{n} e^{\lambda k q - \lambda p (n - k)} {n \choose k} p^k (1 - p)^{n - k} \\ &\le e^{-\lambda n \epsilon} \sum_{k = 0}^{n} {n \choose k} (pe^{\lambda q})^k (qe^{-\lambda p})^{n -k}\\ &= e^{-\lambda n \epsilon} (pe^{\lambda q} + qe^{-\lambda p})^n \tag{2} \end{align*}
where $q = 1 - p$. In the third line we used $\lambda k - \lambda n p = \lambda k q - \lambda p (n - k)$; the second inequality holds because the extra terms with $0 \le k < m$ are nonnegative, and the final equality is the binomial theorem.
We now make use of a rather simple inequality:
Claim. $e^x \le x + e^{x^2}$ for all real $x$.
Verification: Let $f(x) = x + e^{x^2} - e^x$, so that $f(0) = 0$ and $f'(0) = 1 + 0 - 1 = 0$. Since $x^2 \ge x - \frac{1}{4}$ for all $x$, we have $e^{x^2} \ge e^{x - 1/4}$, and hence \begin{align*} f''(x) &= (2 + 4x^2)e^{x^2} - e^x \ge 2e^{x - 1/4} - e^x = (2e^{-1/4} - 1)e^x > 0 \tag{3} \end{align*} since $2e^{-1/4} \approx 1.56 > 1$. Thus $f$ is convex with global minimum $f(0) = 0$, so $f(x) \ge 0$ everywhere. (A term-by-term comparison of the series for $e^x$ and $x + e^{x^2}$ only works for $|x| \ge 1$, whereas below we apply the inequality to small arguments such as $x = \lambda q$.)
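I also spot-checked the claim numerically on a grid of points (my own sanity check, nothing more):

```python
import math

def claim_holds(x):
    """Check the inequality e^x <= x + e^{x^2} at a single point."""
    return math.exp(x) <= x + math.exp(x * x)

# Check on a grid including |x| < 1, where the inequality has the least slack.
print(all(claim_holds(i / 10) for i in range(-50, 51)))  # True
```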
With the aid of this inequality (applied once with $x = \lambda q$ and once with $x = -\lambda p$), we obtain:
\begin{align*} \mathbb{P}(S_n \ge n(p + \epsilon)) &\le e^{-\lambda n \epsilon} (p\lambda q + pe^{\lambda^2 q^2} - q\lambda p + qe^{\lambda^2 p^2})^n \\ &= e^{-\lambda n \epsilon} \left[pe^{\lambda^2 q^2} + q e^{\lambda^2 p^2}\right]^n \tag{4}\\ &\le e^{\lambda^2 n - \lambda n \epsilon} \tag{5} \end{align*}
We can pick $\lambda$ to minimize the exponent $n(\lambda^2 - \lambda \epsilon)$ on the right-hand side, namely $\lambda = \frac{\epsilon}{2}$, giving:
\begin{align*} \mathbb{P}(S_n \ge n(p + \epsilon)) &\le e^{-\frac{1}{4}n\epsilon^2} \tag{6} \end{align*}
The above inequality is known as Bernstein's inequality. It follows immediately that $\mathbb{P}(S_n/n \ge p + \epsilon) \to 0$ as $n \to \infty$. An exactly analogous argument shows that $\mathbb{P}(S_n/n \le p - \epsilon) \to 0$ as $n \to \infty$. Thus, the theorem is proved.
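Finally, I checked $(6)$ against the exact binomial tail for a few values of $n$ (again my own numerical sketch, taking $p = \frac{1}{2}$ and $\epsilon = 0.1$ for illustration):

```python
import math

def upper_tail(n, p, eps):
    """Exact P(S_n >= n(p + eps)) for S_n ~ Binomial(n, p), as in (1)."""
    m = math.ceil(n * (p + eps))  # least integer >= n(p + eps)
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(m, n + 1))

def bernstein_bound(n, eps):
    """Right-hand side of (6): e^{-n * eps^2 / 4}."""
    return math.exp(-n * eps**2 / 4)

# The exact tail stays below the bound, and both tend to 0 as n grows.
for n in (50, 200, 1000):
    print(n, upper_tail(n, 0.5, 0.1), bernstein_bound(n, 0.1))
```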
Question. How did the author magically go from $(4)$ to $(5)$?