I am having trouble understanding Wikipedia's definition of filtration in probability theory:
Definition "filtration"
- Let $(\Omega ,\mathcal {A}, P)$ be a probability space
Let $I$ be a totally ordered index set
Then $\mathbb{F} = \bigl(\mathcal{F}_i\bigr)_{i\in I}$ is a filtration if every $\mathcal{F}_i$ is a sub-$\sigma$-algebra of $\mathcal{A}$ and for all $(m,n) \in I \times I$ we have $\mathcal{F}_m \subseteq \mathcal{F}_n$ whenever $m \le n$
$~ \square$
This definition feels unintuitive: I would have expected the containment to go the other way, $\mathcal{A} = \mathcal{F}_0 \supseteq \mathcal{F}_m \supseteq \mathcal{F}_n$.
After all, the set of possible outcomes becomes smaller as one observes longer prefixes and more of the process that's unfolding.
For concreteness, can someone provide an example of what the structure of the sets $\mathcal{F}_n$ looks like? Say for a sequence of Bernoulli RVs, $X_1, X_2, X_3, \ldots$, what do $\mathcal{F}_0$, $\mathcal{F}_1$, $\mathcal{F}_2$ contain? And what does $\mathcal{A}$ contain?
${\cal F}_n$ makes all of the distinctions that ${\cal F}_m$ makes, for $m\leq n$, but may also make finer-grained distinctions. That is, ${\cal F}_n$ contains all of the sets in ${\cal F}_m$, and it may also contain sets that are proper subsets of the smallest nonempty sets in ${\cal F}_m$. That's why ${\cal F}_m \subseteq {\cal F}_n$: the number of events (subsets) becomes larger, but (some of) the smallest of them are smaller (or may be, since it's a $\subseteq$ relationship). What shrinks as you observe more of the process is the set of outcomes consistent with what you have seen; what grows is the collection of events whose occurrence you can decide.
For example, suppose that we toss a coin three times, recording 1 for heads and 0 for tails, so the atomic outcomes are the eight triples of 1's and 0's. I'll describe a filtration on this space induced by the three tosses, with $*$ to represent that any possible outcome is allowed for some elements in a triple. That is, I'll use $\langle 0,1,* \rangle$, for example, to refer to the set of all sequences of toss outcomes in which the first toss comes up tails, the second one comes up heads, and the third toss has either outcome. (Note that I am abusing notation a bit to represent a set, not a sequence per se. I'm not distinguishing between what are called atoms and the singleton sets containing them.)
To create each stage of the filtration, I will start with some of these atomic outcomes and create an algebra that contains those atomic outcomes plus all possible unions of them, along with the empty set. (That is, we take the set of atomic outcomes and then take its closure under unions.)
For the first toss, the atomic outcomes are
$$A_0=\langle 0, \ast , * \rangle, \; A_1=\langle 1, * , * \rangle \;,$$
and the full algebra of outcomes is
$${\cal F}_1 = \{ A_0, A_1, A_0 \cup A_1, \varnothing \} = \{ \langle 0, \ast , \ast \rangle, \langle 1, \ast , \ast \rangle, \langle 0, \ast , \ast \rangle \cup \langle 1, \ast , \ast \rangle, \varnothing \}\;. $$
For the first and second tosses, the atomic outcomes are
$$A_{00}=\langle 0,0, * \rangle, \; A_{01}=\langle 0,1, * \rangle, \; A_{10}=\langle 1,0, * \rangle, \; A_{11}=\langle 1,1, * \rangle \;.$$
The second element in the filtration, ${\cal F}_2$, contains these four outcomes plus all of their possible unions and the empty set. This means that ${\cal F}_2$ contains sixteen sets, and I won't list them all. Note that the notation I used for ${\cal F}_1$ just summarizes some of the notation I'm using for ${\cal F}_2$: $A_0=A_{00}\cup A_{01}$ and $A_1=A_{10}\cup A_{11}$. So all of the outcomes in ${\cal F}_1$ are included in ${\cal F}_2$, and therefore ${\cal F}_1 \subseteq {\cal F}_2$.
The atomic outcomes for all three tosses will correspond to the $2^3=8$ possible three-element sequences of 0 and 1, which we can represent as
$$A_{000}=\langle 0,0,0\rangle, \; A_{001}=\langle 0,0,1\rangle, \; A_{010}=\langle 0,1,0\rangle, \;\ldots,\; A_{111}=\langle 1,1,1\rangle \;.$$
(I'm still abusing notation: those are sets.)
The third algebra in the filtration, ${\cal F}_3$, consists of these eight sets plus all possible unions of them, along with the empty set. There are $2^8 = 256$ such sets in ${\cal F}_3$, one for each choice of which of the eight atomic sets to include in the union. Note that again some of these unions are equal to the atomic sets defined for ${\cal F}_2$: $A_{00} = A_{000}\cup A_{001}$, and so on. So ${\cal F}_3$ includes all of the outcomes in ${\cal F}_2$, and ${\cal F}_2 \subseteq {\cal F}_3$.
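If it helps to see the three stages concretely, here is a minimal Python sketch that builds them by brute force. The names `OMEGA`, `atoms`, `generated_algebra`, and `F` are my own choices for illustration, not standard notation. Each event is stored as a `frozenset` of triples, and each stage is built exactly as described above: take the atomic sets and form every possible union, with the empty union giving the empty set. An algebra built this way from $k$ atomic sets has $2^k$ members, which the script counts directly.

```python
from itertools import combinations, product

# Sample space: the eight triples of 0/1 toss results.
OMEGA = frozenset(product((0, 1), repeat=3))

def atoms(n):
    """Atomic sets after n tosses: outcomes grouped by their first n results."""
    groups = {}
    for w in OMEGA:
        groups.setdefault(w[:n], set()).add(w)
    return [frozenset(g) for g in groups.values()]

def generated_algebra(atom_list):
    """Every union of atomic sets; the empty union gives the empty set."""
    algebra = set()
    for k in range(len(atom_list) + 1):
        for chosen in combinations(atom_list, k):
            algebra.add(frozenset().union(*chosen))
    return algebra

# F[1], F[2], F[3] play the roles of the three stages of the filtration.
F = {n: generated_algebra(atoms(n)) for n in (1, 2, 3)}

for n in (1, 2, 3):
    print(n, len(atoms(n)), len(F[n]))   # stage, number of atoms, number of events

# The stages are nested: every event in an earlier stage is in the later ones.
print(F[1] <= F[2] <= F[3])

# Each stage is closed under complements, because its atoms partition OMEGA.
print(all(OMEGA - event in F[n] for n in F for event in F[n]))
```

The set `<=` comparisons are subset tests, so the nesting check is literally ${\cal F}_1 \subseteq {\cal F}_2 \subseteq {\cal F}_3$. For example, the event $\langle 0, *, * \rangle$ appears in `F[1]` as a single atomic set and in `F[2]` as the union of two atomic sets, and the two are equal as `frozenset`s.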
(By the way, the trick of starting with some smallest, atomic sets and then creating the algebras by taking unions is one that works when each of the original random variables can take a finite or countably infinite set of values. When the random variables have continuous, uncountably infinite, sets of values, it's usually necessary to start with specific larger sets, such as intervals, and then form $\sigma$-algebras by closing under complements and countable unions. That's a topic for other questions, which have no doubt been asked and answered.)