How does the frequentist definition of probability work with non-measurable sets?


Let $E$ be a subset of $[0,1]$, and let us try to define a measure of $E$ as follows. Let $x_1,x_2,\dots$ be a sequence of independent random real numbers drawn from the uniform probability distribution on the interval $[0,1]$, and let us send each $x_n$ through an oracle which tells us whether $x_n\in E$ or not. Then define the measure of $E$ to be the limit, as $n$ goes to infinity, of the fraction of the first $n$ elements which are in $E$.
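For a Lebesgue-measurable set the procedure can be tried out numerically, and the strong law of large numbers says the fraction converges almost surely to the Lebesgue measure of $E$. Below is a minimal Python sketch under stated assumptions: the oracle `in_interval` and the choice $E=[0.2,0.5]$ are hypothetical examples, not part of the question, and `random.random()` samples from $[0,1)$, which differs from $[0,1]$ only on a null set. The sketch says nothing about non-measurable sets, for which no explicit oracle is available.

```python
import random


def empirical_fraction(in_E, n, seed=0):
    """Fraction of n uniform draws from [0, 1) that the oracle in_E accepts."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    hits = sum(1 for _ in range(n) if in_E(rng.random()))
    return hits / n


def in_interval(x):
    # Hypothetical oracle for the measurable set E = [0.2, 0.5],
    # whose Lebesgue measure is 0.3.
    return 0.2 <= x <= 0.5


# Print the running estimate for increasing sample sizes.
for n in (100, 10_000, 100_000):
    print(n, empirical_fraction(in_interval, n))
```

For this choice of $E$ the printed fractions should settle near $0.3$ as $n$ grows.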

My question is, for what sets $E$ does this limit exist? Does it exist for all sets $E$, or only for Lebesgue-measurable sets, or what? What exactly happens when you try the procedure for non-Lebesgue measurable sets? Will the limit just diverge?

Best answer

Initial answer: I don't have a rigorous answer for you, but let me point out that every subset $E$ of $[0,1]$ (measurable or not) has a rigorously defined inner Lebesgue measure $m_*(E)$ and outer Lebesgue measure $m^*(E)$, both of which belong to $[0,1]$ and satisfy $m_*(E)\le m^*(E)$. See Wikipedia and links therein and related posts on this site.
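For reference, one standard way to write these two quantities for $E\subseteq[0,1]$ is the following (the notation $|I_k|$ for the length of an interval is my own choice here):

$$m^*(E)=\inf\Big\{\sum_{k}|I_k| \;:\; E\subseteq\bigcup_{k} I_k,\ I_k \text{ intervals}\Big\},\qquad m_*(E)=1-m^*\big([0,1]\setminus E\big).$$

The set $E$ is Lebesgue measurable exactly when $m_*(E)=m^*(E)$, in which case the common value is the Lebesgue measure of $E$.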

I believe that the sequence you asked about will generally have liminf equal to the inner measure of $E$ and limsup equal to the outer measure of $E$. So for a non-measurable set where the inner and outer measures differ, the sequence will not converge to any particular value.


Complete answer after discussion in comments: The paper posted by Keshav in the comments below resolves this question, and it also shows the following:

  1. My conjecture that the liminf and limsup equal the inner and outer measures is incorrect.

  2. My second sentence, "So for a non-measurable set where the inner and outer measures differ, the sequence will not converge to any particular value.", is however correct.

Both statements follow from Theorem 1.3 in the linked paper. Indeed, the sequence of indicator functions $X_i=1_{x_i\in E}$ satisfies the paper's definition of an "iidpnmrvs" sequence (i.e., a sequence of iid possibly non-measurable random variables), and moreover the lower and upper expectations as defined in the paper coincide with the inner and outer measures of $E$. Thus the theorem applies. If we assume that $E$ is non-measurable, then the lower and upper expectations are not equal, and the theorem shows that all the events we are interested in are "maximally non-measurable": they occur neither "with probability $1$" nor "with probability $0$", and we cannot get any nontrivial bounds on their probability. Each such event is too badly behaved as a set (its inner measure is $0$ and its outer measure is $1$).

To see that this contradicts my conjecture (1), take the set $A$ in Theorem 1.3 to be the singleton $\{m_*(E)\}$ in part (i) of the theorem, and the singleton $\{m^*(E)\}$ in part (ii). In both cases the theorem says that the event that the corresponding liminf or limsup equals the corresponding measure does not have probability $1$, whereas my conjecture claimed that it does (although I was not precise enough to include the "almost surely" in my statement).

To see that my statement (2) is correct, apply Theorem 1.3 with $A=\{a\}$, where $a$ is any deterministic element of $[m_*(E),m^*(E)]$. Then the theorem shows that the event that the sequence converges to $a$ (which is the intersection of the events in parts (i) and (ii) of the theorem) does not have probability $1$, which means that the sequence cannot converge almost surely to any deterministic limit.
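Spelled out, with $f_n=\frac1n\sum_{i=1}^n X_i$ denoting the fraction of the first $n$ samples lying in $E$ (my notation, not necessarily the paper's), and assuming that parts (i) and (ii) of the theorem concern the liminf and the limsup respectively, as used above, the intersection in question is

$$\Big\{\lim_{n\to\infty} f_n = a\Big\}=\Big\{\liminf_{n\to\infty} f_n = a\Big\}\cap\Big\{\limsup_{n\to\infty} f_n = a\Big\}.$$

This identity is elementary: a bounded sequence converges to $a$ exactly when its liminf and limsup both equal $a$.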