How can I show that the "binary digit maps" $b_i : [0,1) \to \{0,1\}$ are i.i.d. Bernoulli random variables?

1.4k Views Asked by At

In this post What is the Lebesgue measure of the set of numbers in $[0,1]$ that has two thirds of ones in their infinite base-2 expansion? we needed the fact that if we let $b_i (x) \in \{0,1\}$ for $x \in [0,1)$ denote the $i$-th digit in the infinite binary expansion of $x$, i.e.

$$ x = \sum_{n=1}^\infty b_n (x) \cdot 2^{-n}, $$

where $b_n (x) = 1$ for all but finitely many $n$ (except for $x=0$), then the $(b_n)_n$ form a family of independent Bernoulli random variables, if they are considered as maps defined on the probability space $([0,1), \mathcal{B}, \lambda)$, where $\mathcal{B}$ are the Borel sets and $\lambda$ denotes Lebesgue measure.

A quick search did not turn up any useful results, so my question is for a nice, more or less elementary, proof of this result, which can also be used for future reference.

I will "answer my own question", i.e. provide one possible proof. But I would also be interested in other proofs!

3

There are 3 best solutions below

0
On BEST ANSWER

Let $N\in\mathbb{N}$ be arbitrary. It suffices to show that $\left(b_{1},\dots,b_{N}\right)$ is an $N$-tuple of independent Bernoulli variables. For each $n\in\left\{ 1,\dots,N\right\} $, we have $$ \left[0,1\right)=\biguplus_{\ell=1}^{2^{n}}\left[\left(\ell-1\right)\cdot2^{-n},\ell\cdot2^{-n}\right) $$ and the random variable $b_{n}$ is constant on each of the intervals $$ \left[\left(\ell-1\right)\cdot2^{-n},\ell\cdot2^{-n}\right)=\biguplus_{j=\left(\ell-1\right)\cdot2^{N-n}}^{\ell\cdot2^{N-n}}\left[\left(j-1\right)\cdot2^{-N},j\cdot2^{-N}\right). $$ More precisely, we have $$ b_{n}\equiv\left(\left(\ell-1\right)\,{\rm mod}\,2\right)\qquad\text{ on }\left[\left(\ell-1\right)\cdot2^{-n},\ell\cdot2^{-n}\right)\qquad\left(\dagger\right) $$ and hence $$ b_{n}\equiv\left(\left(\ell-1\right)\,\mod\,2\right)\qquad\text{ on }\left[\left(j-1\right)\cdot2^{-N},j\cdot2^{-N}\right)\qquad\text{ if }\left(\ell-1\right)\cdot2^{N-n}\leq j\leq\ell\cdot2^{N-n}. $$

Observe the for each $j\in\left\{ 1,\dots,2^{N}\right\} $, there is some $\ell\in\left\{ 1,\dots,2^{n}\right\} $ such that $\left(\ell-1\right)\cdot2^{N-n}\leq j\leq\ell\cdot2^{N-n}$. This easily implies that the map $$ f:\left(b_{i},\dots,b_{N}\right):\left[0,1\right)\to\left\{ 0,1\right\} ^{N} $$ is constant on each of the $2^{N}$ intervals $\left[\left(j-1\right)\cdot2^{-N},j\cdot2^{-N}\right)$ for $j\in\left\{ 1,\dots,2^{N}\right\} $.

We finally observe that $f$ is surjective, because for $k:=\left(k_{1},\dots,k_{N}\right)\in\left\{ 0,1\right\} ^{N}$ arbitrary, we can set $$ x:=\sum_{\ell=1}^{N}k_{\ell}2^{-\ell}+\sum_{\ell=N+1000}^{\infty}2^{-\ell}\in\left[0,1\right). $$ Then the right hand side of the above equation directly yields the infinite binary expansion of $x$ and hence $b_{n}\left(x\right)=k_{n}$ for all $n\in\left\{ 1,\dots,N\right\} $, which means $f\left(x\right)=k$.

As the cardinality $\left|\left\{ 0,1\right\} ^{N}\right|=2^{N}$ is equal to the cardinality $$ 2^{N}=\left|\left\{ 1,\dots,2^{N}\right\} \right|=\left|\left\{ \left[\left(j-1\right)\cdot2^{-N},j\cdot2^{-N}\right)\,\mid\, j\in\left\{ 1,\dots,2^{N}\right\} \right\} \right|, $$ this implies that for each $k=\left(k_{1},\dots,k_{N}\right)\in\left\{ 0,1\right\} ^{N}$, there is a unique $j_{k}\in\left\{ 1,\dots,2^{N}\right\} $ such that $f\equiv k$ on $\left[\left(j_{k}-1\right)\cdot2^{-N},j_{k}\cdot2^{-N}\right)$. But this ensures \begin{eqnarray*} \lambda\left(\left\{ x\in\left[0,1\right)\,\mid\, b_{1}\left(x\right)=k_{1},\dots,b_{N}\left(x\right)=k_{N}\right\} \right) & = & \lambda\left(\left[\left(j_{k}-1\right)\cdot2^{-N},j_{k}\cdot2^{-N}\right)\right)\\ & = & 2^{-N}\\ & = & \prod_{j=1}^{N}\lambda\left(\left\{ x\in\left[0,1\right)\,\mid\, b_{j}\left(x\right)=k_{j}\right\} \right), \qquad (1) \end{eqnarray*} where the last equality is due to $\left(\dagger\right)$, which implies $$ \left\{ x\in\left[0,1\right)\,\mid\, b_{j}\left(x\right)=0\right\} =\biguplus_{\ell=0}^{2^{j-1}-1}\left[2\ell\cdot2^{-j},\left(2\ell+1\right)\cdot2^{-j}\right) $$ and hence $$ \lambda\left(\left\{ x\in\left[0,1\right)\,\mid\, b_{j}\left(x\right)=0\right\} \right)=\sum_{\ell=0}^{2^{j-1}-1}\lambda\left(\left[2\ell\cdot2^{-j},\left(2\ell+1\right)\cdot2^{-j}\right)\right)=2^{-j}\cdot2^{j-1}=\frac{1}{2} \qquad (2) $$ as well as $$ \lambda\left(\left\{ x\in\left[0,1\right)\,\mid\, b_{j}\left(x\right)=1\right\} \right)=1-\lambda\left(\left\{ x\in\left[0,1\right)\,\mid\, b_{j}\left(x\right)=0\right\} \right)=\frac{1}{2}. \qquad (3) $$ In summary, equations (1)-(3) establish the claim.

4
On

Here is a nice proof using Viete's theorem and Rademacher functions. I won't reproduce it here but you may find it interesting?

0
On

The infinite binary expansion means the non-terminating binary expansion, i.e., when there exist two expansions for $x$, one a non-terminating expansion $0.\ldots 01111\cdots$ with $1$ trailing forever, and the equivalent expansion $0.\ldots 10000\cdots$ which terminates, choose the former. (You don't want the condition "$b_i(x)=1$ for all but finitely many $n$", which is satisfied for only a countable number of $x$.) With this in mind, the interval $(0,1]$ is a more natural domain for the $b_i$'s.


Let $e_1,e_2,\ldots,e_n\in\{0,1\}$. The set $$\{b_1=e_1,b_2=e_2,\ldots,b_n=e_n\}:=\bigcap_{k=1}^n\{x\in(0,1]:b_k(x)=e_k\}\tag1$$ consists of all $x$ whose binary expansion begins $0.e_1e_2\cdots e_n$. This set has measure $2^{-n}$, since it coincides with the interval $(d, d+2^{-n}]$ whose left endpoint is the dyadic rational $d:=\sum_i^n e_i/2^i$. (Or prove by symmetry: all $2^n$ possible initial expansions $0.e_1\cdots e_n$ of $n$ elements from $\{0,1\}$ must have equal measure.) Hence $$\lambda(b_1=e_1,b_2=e_2,\ldots,b_n=e_n)=\frac1{2^n}.\tag2$$ By representing the set $\{b_k=e_k\}$ as a disjoint union of sets of the form (1), conclude for each $k$ $$\lambda(b_k=e_k) = \frac12.\tag3$$ Combining (2) and (3) gives, for each $e_1,\ldots,e_n$, $$\lambda(b_1=e_1,b_2=e_2,\ldots,b_n=e_n)=\prod_{k=1}^n \lambda(b_k=e_k).\tag4$$ So (4) is saying that $b_1,\ldots, b_n$ are independent, while (3) is saying each $b_k$ has Bernoulli($1/2$) distribution.