My book states the following theorem:
Let $X$ be a random variable with sample space $\Omega$. If $F_1, F_2, \ldots, F_r$ are events such that $F_i$ and $F_j$ are disjoint (for $i$ not equal to $j$) and $\Omega = \cup_j F_j$, then $E(X) = \sum_j E(X|F_j)P(F_j)$.
I understand this to mean that given $X$, which can take on a value / outcome from $\Omega$, and $F_1, F_2, \ldots, F_r$, which are pairwise disjoint and whose union forms $\Omega$, $E(X)$ can be computed by the given equation. However, I had trouble understanding the example illustrating this theorem:
Let T be the number of rolls in a single play of craps. We can think of a single play as a two-stage process. The first stage consists of a single roll of a pair of dice. The play is over if this roll is a 2, 3, 7, 11, or 12. Otherwise, the player's point is established, and the second stage begins. This second stage consists of a sequence of rolls which ends when either the player's point or a 7 is rolled. We record the outcomes of this two-stage experiment using the random variables X and S, where X denotes the first roll, and S denotes the number of rolls in the second stage of the experiment (of course, S is sometimes equal to 0). Note that T = S + 1. Then $E(T) = \sum_{j=2}^{12} E(T|X = j)P(X=j)$
Here I think $T$ is analogous to the $X$ in the theorem above, and each outcome of the roll of the two dice ($X = j$) is an event analogous to an $F_j$. However, the values $T$ can take are in the set $\{1, 2, 3, \ldots\}$, while I think the set formed by the union of the events $X = j$ over all $j$ is just $\{2, 3, \ldots, 12\}$. I'm confused about what the sample space $\Omega$ would be in this example, since the values of the random variable and the events appear to be drawn from different sets.
This is called the "law of total expectation" and is similar to the "law of total probability." The sample space $\Omega$ is the set of all outcomes $\omega$ of the form: $$ \omega = (\text{first roll}, \text{sequence of other rolls})=(X, \text{sequence of other rolls})$$ This sample space can be partitioned into events where the first roll is 2, the first roll is 3, ..., the first roll is 12. So the partition is: $$ \Omega = \{X=2\} \cup \{X=3\}\cup\{X=4\}\cup\cdots\cup\{X=12\}$$ We know that $X$ can only take values in the set $\{2, \ldots, 12\}$, and so the events $\{X=2\}, \{X=3\}, \ldots, \{X=12\}$ are indeed mutually exclusive and collectively exhaustive. The event $\{X=4\}$ contains all outcomes that start with a first roll of 4.
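The individual events $\{X=i\}$ look like this: \begin{align} \{X=2\} &= \{(2)\}\\ \{X=3\} &= \{(3)\}\\ \{X=4\} &= \{(4, 4), (4, 2, 2, 5, 4), (4, 12, 5, 5, 7), (4, 8, 4), \ldots\} \end{align} and so on. The event $\{X=4\}$ has a countably infinite number of outcomes, but all of them are sequences that start with 4, end with either 4 or 7, and contain no 4 or 7 in between. Note that $T$ is simply the length of the sequence $\omega$, so $T$ and $X$ are both functions on the same sample space $\Omega$, even though they take values in different sets.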
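To make the structure of an outcome $\omega$ concrete, here is a minimal Python sketch that samples a few outcomes as tuples of roll totals and reads off $X$ and $T$ from each. The function name `play_craps` is just an illustrative choice, and the sketch assumes two fair, independent six-sided dice.

```python
import random

def play_craps(rng):
    """Return one outcome omega as a tuple of roll totals (hypothetical helper)."""
    def roll():
        # Total of two fair six-sided dice (assumed fair and independent)
        return rng.randint(1, 6) + rng.randint(1, 6)

    first = roll()
    rolls = [first]
    # Stage 1: the play ends immediately on 2, 3, 7, 11, or 12
    if first in (2, 3, 7, 11, 12):
        return tuple(rolls)
    # Stage 2: keep rolling until the point or a 7 appears
    while True:
        r = roll()
        rolls.append(r)
        if r == first or r == 7:
            return tuple(rolls)

rng = random.Random(0)
for _ in range(5):
    omega = play_craps(rng)
    # X is the first coordinate of omega; T is its length
    print(omega, "X =", omega[0], "T =", len(omega))
```

Every sampled tuple belongs to exactly one event $\{X=j\}$, determined by its first entry, which is what makes these events a partition of $\Omega$.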
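The events $\{X=2\}, \{X=3\}, \{X=7\}, \{X=11\}, \{X=12\}$ all contain just one outcome each and so we trivially have \begin{align} E[T|X=2]&=1\\ E[T|X=3]&=1\\ E[T|X=7]&=1\\ E[T|X=11]&=1\\ E[T|X=12]&=1 \end{align} On the other hand, $E[T|X=4]$ is equal to 1 plus the expected number of additional rolls until either a 4 or a 7 appears. Each roll is a 4 or a 7 with probability $3/36 + 6/36 = 1/4$, so that expected number is $4$ (the mean of a geometric distribution with success probability $1/4$) and $E[T|X=4] = 5$.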
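The same reasoning applies to every point, so the whole sum $\sum_{j=2}^{12} E[T|X=j]P(X=j)$ can be evaluated exactly. The sketch below does this with Python's `fractions` module, assuming fair, independent dice and using the fact that the number of rolls until a given event first occurs is geometric with mean $1/p$. The names `prob` and `cond_expected_T` are illustrative, not from the book.

```python
from fractions import Fraction

# P(X = j) for the total of two fair dice: (6 - |j - 7|) / 36
prob = {j: Fraction(6 - abs(j - 7), 36) for j in range(2, 13)}

def cond_expected_T(j):
    """E[T | X = j], assuming independent rolls of fair dice."""
    if j in (2, 3, 7, 11, 12):
        return Fraction(1)          # play ends on the first roll
    p_stop = prob[j] + prob[7]      # chance a given later roll ends the play
    return 1 + 1 / p_stop           # first roll + geometric mean 1/p

# Law of total expectation over the partition {X=2}, ..., {X=12}
expected_T = sum(cond_expected_T(j) * prob[j] for j in range(2, 13))
print(expected_T)  # 557/165
```

So under these assumptions $E[T] = 557/165$, a little under $3.4$ rolls per play.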