How can we approximate the probability of a nontrivial set from mutually independent samples?


Let $(E,\mathcal E,\mu)$ be a probability space and $B\in\mathcal E$ with $\mu(B)>0$.

Given $n\in\mathbb N$ samples $x_1,\ldots,x_n$ drawn from $\mu$, the number of these samples lying in $B$ is $k:=\sum_{i=1}^n1_B(x_i)$. Assuming $n$ is large enough, how can we verify our intuition that $\mu(B)\approx k/n$?
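A quick Monte Carlo sketch of this intuition (the measure, the set, and the helper `estimate_mu_B` below are hypothetical choices made for illustration, not part of the question): take $\mu$ uniform on $[0,1]$ and $B=[0,0.3]$, so that $\mu(B)=0.3$, and compute $k/n$.

```python
import random

# Hypothetical setup (not from the question): mu = Uniform(0, 1),
# B = [0, 0.3], so the true value is mu(B) = 0.3.
def estimate_mu_B(n, p=0.3, seed=0):
    rng = random.Random(seed)
    # k = number of samples x_i with 1_B(x_i) = 1
    k = sum(1 for _ in range(n) if rng.random() < p)
    return k / n

print(estimate_mu_B(100_000))  # lands close to 0.3
```

For large $n$ the fraction $k/n$ is close to $0.3$, which is exactly what the strong law of large numbers (discussed in the answers below) guarantees.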

Assuming $X_1,\ldots,X_n$ are mutually independent random variables distributed according to $\mu$, a related question seems to be whether, for $k\in\left\{1,\ldots,n\right\}$, the probability that exactly $k$ of these random variables lie in $B$ is approximately $\mu(B)$, i.e. $\operatorname P\left[\sum_{i=1}^n1_B(X_i)=k\right]\approx\mu(B)$.

How can we verify that?


Accepted answer:

If $X_1,\dots,X_n$ are iid random variables then so are $Y_1,\dots,Y_n$ where $Y_i:=\mathbf1_B(X_i)$.

The $Y_i$ have a Bernoulli distribution with parameter $\mu(B)=P(X_1\in B)$.

Let $S_n:=\sum_{i=1}^nY_i$ and $\bar{S}_n=\frac1nS_n$.

The strong law of large numbers tells us that $$\bar{S}_n\stackrel{a.s.}{\to}\mathbb E Y_1=\mu(B).$$
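This convergence is easy to watch numerically (the concrete setup below is an assumed example, not part of the answer: $X_i\sim\text{Uniform}(0,1)$ and $B=[0,0.5]$, so $\mu(B)=0.5$):

```python
import random

# Assumed example: X_i ~ Uniform(0, 1), B = [0, 0.5], so mu(B) = 0.5.
# The running mean S_n / n drifts toward mu(B) as n grows (SLLN).
rng = random.Random(42)
p = 0.5
s = 0
for n in range(1, 1_000_001):
    s += rng.random() < p        # Y_n = 1_B(X_n), a Bernoulli(p) variable
    if n in (100, 10_000, 1_000_000):
        print(n, s / n)          # successive estimates settle near 0.5
```

The printed estimates wander for small $n$ and stabilize near $\mu(B)=0.5$ as $n$ grows, which is the almost-sure convergence in action.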

Also observe that $S_n$ has binomial distribution with parameters $n$ and $\mu(B)$.

This tells us that for any fixed $k\in\{0,1,\dots,n\}$ we have: $$P(S_n=k)=\binom{n}{k}\mu(B)^k(1-\mu(B))^{n-k}.$$ We do not have anything like $P(S_n=k)\approx\mu(B)$ for a specific integer $k$ (as you seem to suggest). In fact, as $n$ grows, $P(S_n=k)\to0\neq\mu(B)$ for every fixed $k$.
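This decay can be checked directly from the binomial formula (the values $\mu(B)=0.3$ and $k=2$ below are arbitrary illustrative choices):

```python
from math import comb

# Binomial point mass P(S_n = k); here p stands for mu(B).
def pmf(n, k, p=0.3):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# For fixed k = 2, the probability decays to 0 as n grows,
# so P(S_n = k) cannot approximate mu(B) = 0.3:
for n in (5, 20, 100):
    print(n, pmf(n, 2))
```

The probabilities shrink rapidly: the mass spreads over more and more values of $k$, so any single point gets an ever smaller share.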

This even holds uniformly in $k$, which is a stronger statement: $$\max\{P(S_n=k)\mid k\in\{0,\dots,n\}\}\to 0<\mu(B)\quad\text{as }n\to\infty.$$
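The uniform statement can be checked the same way; as a rough guide (a standard local-limit fact, not stated in the answer), the largest binomial point mass is on the order of $1/\sqrt{2\pi n\,\mu(B)(1-\mu(B))}$. Again $\mu(B)=0.3$ is an arbitrary choice:

```python
from math import comb

def max_pmf(n, p=0.3):
    # largest single-point probability: max over k of P(S_n = k)
    return max(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

for n in (10, 100, 1000):
    print(n, max_pmf(n))  # shrinks toward 0, never approaching mu(B)
```

Even the most likely value of $S_n$ carries vanishing probability, so no choice of $k$ rescues the approximation $P(S_n=k)\approx\mu(B)$.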

Another answer:

Since the $x_i$ are i.i.d., the strong law of large numbers (SLLN) applies. Let $P$ denote probability and $E$ denote expectation. The SLLN says that if $E(1_B(x_1))$ exists, then almost surely $\left[\sum_{i=1}^n 1_B(x_i)\right]/n \to E(1_B(x_1))$ as $n\to\infty$, so for large $n$ the sample fraction is approximately $E(1_B(x_1))$ and nothing else. But $E(1_B(x_1))=P(x_1 \in B)=\mu(B)$.