Law of large numbers and coin tosses

391 Views Asked by At

The law of large numbers is the following statement:

$\lim_{n \rightarrow\infty}(P(|S_n/n - \mu| < \epsilon))=1$, for all $\epsilon>0$

Where $S_n = X_1 + X_2 + ...+X_n$ iid random variables. For this example, suppose $X_i = 1$ if the toss outcome was heads, and $0$ otherwise, with $P(X = 1) = P(X=0) = 0.5$.

Suppose I flip a coin a large (but finite) number of times, like $n=10,000.$ Does the law of large numbers make it highly probable that the number of heads I get will deviate from $n/2 = 5,000$ by at most $k$, where $k << n$?

For example, does it specify a lower bound on the expression below?

$P(|S_{10000} - 10,000/2| < 100) \geq ?$

Or does it say nothing about large but finite $n$?

EDIT:

This is where I reached:

$\lim_{n \rightarrow\infty}P(|S_{n}/n - 0.5| < \epsilon) = 1$

$\lim_{n \rightarrow\infty}P(|nS_{n}/n - 0.5n| < n\epsilon) = 1$

$\lim_{n \rightarrow\infty}P(|S_{n} - 0.5n| < n\epsilon) = 1$, for all $\epsilon>0$

2

There are 2 best solutions below

3
On BEST ANSWER

The weak law of large numbers essentially only states that the random variable $S_n/n$, describing the average number of heads obtained, tends to its expected value $E\left[\frac{S_n}n\right]=\mu=0.5$ in probability. Graphically, this means that as $n$ grows, the PMF of $S_n/n$ shrinks and becomes increasingly localized around $0.5$.

For the asymptotic behaviour of $S_n$, you have to refer to the Central Limit Theorem which states that the sum of $n$ independent, identically distributed random variables $X_i;i=1,2,3,...$ is asymptotically normal under certain conditions. Also note that in your example, $S_n$ is actually a binomial distribution. The special case of CLT that deals with binomial distributions is the De Moivre-Laplace Theorem.

For large but finite $n$, you may approximate $S_n$ by a normal distribution, in accordance with the CLT. For lower bounds on $P(|S_n-E[S_n]|<k\sqrt{V[S_n]})$ for all positive $k$, the Chebychev's inequality comes to our aid. From the Chebychev's inequality, we see$$P(|S_n-n/2|<\frac{k\sqrt n}2)\ge1-\frac1{k^2}$$or$$P(|S_n-n/2|<m)\ge1-\frac n{4m^2}$$

0
On

You are asking about the rate of convergence of the laws of large numbers, which is the subject of large deviations.

Assume for simplicity that $\mu = 0$ and suppose that $\phi(t) = \mathbb{E}[e^{t X_1}]$ is finite for all $t \in \mathbb{R}$. This condition is not necessary, but is illustrative. The normal distribution satisfies this requirement.

Define the rate function $$I(\beta) = \inf_{t \in \mathbb{R}} \{ t\beta - \log\phi(t)\}.$$ Then, under appropriate conditions, $$\lim_{n\rightarrow\infty} \frac{1}{n}\log P(|S_n/n| \geq \varepsilon) = -I(\varepsilon).$$ For your example, $X_i$ are Bernoulli with parameter $p$ and $I(\beta) = \beta \log(\beta/p) + (1-\beta)\log((1-\beta)/(1-p))$ for $\beta \geq 0$, and $I(\beta) = +\infty$ otherwise. Note that $I(p) = 0$. A crude LDP approximation would be \begin{align*} P(|S_{10000} - 5000| >100 ) &= P\bigg(\bigg|\frac{S_{10000}}{10000} - 0.5\bigg| > 0.01\bigg)\\ &\approx e^{- 10000 I(0.01)}. \end{align*} For $p=0.5$, $I(\beta)$ for Bernoullis evaluates to roughly $0.64$ if I used a calculator correctly. This does not mean the probability is upper or lower bounded by $e^{-10000\times 0.64}$, only that the exponential rate of decay is $0.64$. Indeed, whether $P(\cdot) = e^{-n I(x)}$ or $ P (\cdot) = e^{-n I(x) + C\sqrt{n}}$, $I(x)$ is unchanged.

Edit: You can strengthen Shubham's results by using exponentials instead of squares.