Expected Value Of A Process - Formalization / Foundations

Consider the question: Let $X$ be the random variable describing the number of rolls of a six-sided die needed till you see a $6$. What is $\mathbb{E}(X)$? Usually the answer given is $6$. What is such a question asking, in a formal mathematical sense?

It seems you're taking the set of all sequences of $1$ to $6$, removing all of them that continue after a $6$ is seen, and then assigning a probability to each one and adding up the lengths weighted by those probabilities. This method works in a countable case, because for an absolutely convergent series the order of addition doesn't matter.

But here we have uncountably many sequences of length $\infty$, because there are uncountably many sequences of just $1$ through $5$ that never hit a $6$. Each such sequence has probability $0$, but that doesn't necessitate that their sum be $0$. In the continuous case, the way you add probabilities matters. In the dice question we are not given a particular way to add the probabilities.

You could say: to formalize, don't consider any sequences that are not finite. But then you cannot correctly answer similar questions. For example, consider a $10$-sided fair die with the surprising property that after each roll it multiplies its number of sides by $10$. So after the first roll it becomes a $100$-sided fair die with the numbers $1$ through $100$ on it. What is the expected number of rolls till you see a $1$? The chance you ever see a $1$ is at most $\frac1{10}+\frac1{100}+\cdots=.111111\ldots=\frac{1}{9}$ (in fact slightly less, since for independent rolls the exact value is $1-\prod_{k\ge1}(1-10^{-k})$). So with probability at least $\frac89$ you never see a $1$, and the expected number of rolls must be $\infty$. If you only consider finite sequences terminating with $1$, your answer will be wrong.
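A quick numerical check of the growing-die example can be sketched as follows. It assumes the rolls are independent and that roll $k$ uses a fair die with $10^k$ sides; the function name `prob_seen_one` is made up for illustration. The chance of missing $1$ on every one of the first $n$ rolls is the product of the per-roll miss probabilities, so the chance of having seen a $1$ is one minus that product.

```python
from fractions import Fraction

def prob_seen_one(rolls: int) -> Fraction:
    """P(at least one 1 in the first `rolls` rolls), where roll k uses a
    fair die with 10**k sides and rolls are assumed independent."""
    miss = Fraction(1)
    for k in range(1, rolls + 1):
        miss *= 1 - Fraction(1, 10**k)  # P(roll k is not a 1)
    return 1 - miss

union_bound = Fraction(1, 9)  # 1/10 + 1/100 + ... as a geometric series
p30 = prob_seen_one(30)

# The exact probability stays strictly below the union bound 1/9.
print(float(p30), float(union_bound))
```

The value settles just under $0.11$ after a handful of rolls, so with probability close to $0.89$ a $1$ never appears, which is what forces the expectation to be infinite.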

Nor is it clear that the following argument works: probability distributions must sum to $1$; add up all the cases you can completely characterize in finitely many rolls; notice that their probabilities sum to $1$; and conclude that any valid probability distribution must assign probability $0$ to the set of sequences that never terminate. The trouble is that it's unclear what it means to completely characterize something in finitely many rolls, and it's not clear that such a method would resolve all the intuitively answerable questions about sequences of rolls.

So, is there a canonical, more-or-less axiomatic way to figure out what mathematical question (e.g., a particular infinite summation) problems of this kind are asking?

There are 1 best solutions below

If you wanted to translate the dice question into a formal mathematical statement, then you might do it like this:

Suppose we have a sequence $\{Z_i\}_{i\ge1}$ of iid random variables defined on some probability space $(\Omega,\mathcal F,\mathbb P)$ such that $\mathbb P(Z_1=k)=\frac16$ for $k=1,\ldots,6$. Let $X=\inf\{n:Z_n=6\}$. What is $\mathbb E(X)$?
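For completeness, here is a sketch of how the usual answer falls out of this formulation, using only the independence of the $Z_i$ and the convention $\inf\emptyset=\infty$:

$$\mathbb P(X=n)=\mathbb P(Z_1\ne6,\ldots,Z_{n-1}\ne6,\,Z_n=6)=\left(\tfrac56\right)^{n-1}\tfrac16,\qquad \mathbb P(X=\infty)=\lim_{n\to\infty}\left(\tfrac56\right)^{n}=0,$$

$$\mathbb E(X)=\sum_{n\ge1}n\left(\tfrac56\right)^{n-1}\tfrac16=\tfrac16\cdot\frac{1}{\left(1-\frac56\right)^{2}}=6.$$

The second line uses $\sum_{n\ge1}nr^{n-1}=(1-r)^{-2}$ for $|r|<1$. The event $\{X=\infty\}$ contributes nothing to the expectation precisely because it has probability $0$.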

Of course, this isn't the only way. You might even call it flawed: after all, how do we know that $X$ is measurable? In this case, measurability of $X$ is pretty trivial, but perhaps you would rather take another approach. Okay. So let's skip the sequence $\{Z_i\}$ and jump straight into defining $X$ on some appropriate probability space $(\Omega,\mathcal F,\mathbb P)$.

One choice, which is what you have done, is to let $\Omega$ be the set of sequences in $\{1,\ldots,6\}$. You then need to choose your $\sigma$-field. Given that we do not care (and indeed, do not know) what happens after we finally roll a $6$, it would be inappropriate to call $\{x\}$ measurable for every sequence $x$; if $x=(1,3,2,3,6,5,4,6,4,\ldots)$ and $y=(1,3,2,3,6,6,6,6,\ldots)$, then for our purposes, $x$ and $y$ are the same. Thus, a suitable choice of $\mathcal F$ is the $\sigma$-field generated by all sets of the form $$A_{(x_1,\ldots,x_{n-1})}:=\{(x_1,\ldots,x_{n-1},6,y_{n+1},y_{n+2},\ldots):y_k\in\{1,\ldots,6\}\text{ for all }k>n\}$$ for some $n\ge1$ and $x_1,\ldots,x_{n-1}\in\{1,\ldots,5\}$. (Note: this includes the choice $n=1$, for which we get $A_\emptyset:=\{(6,y_2,\ldots):y_k\in\{1,\ldots,6\}\text{ for all }k>1\}$.) Now to choose $\mathbb P$. It is not hard to show that these generating sets are countably many and pairwise disjoint, so it suffices to specify the action of $\mathbb P$ on the atoms of $\mathcal F$, which in this case are the $A_{(x_1,\ldots,x_{n-1})}$ together with the leftover set of sequences containing no $6$ at all. Unsurprisingly we set $\mathbb P(A_{(x_1,\ldots,x_{n-1})}):=6^{-n}$; these values sum to $\sum_{n\ge1}5^{n-1}6^{-n}=1$, so the leftover set must get probability $0$. Now define $X$ by $X(x_1,x_2,\ldots)=\inf\{n:x_n=6\}$, and our set-up is done.
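To make the atoms concrete, here is a small sketch that enumerates every $A_{(x_1,\ldots,x_{n-1})}$ with $n$ up to a chosen depth, assigns each the probability $6^{-n}$, and accumulates the total mass and the partial sum for $\mathbb E(X)$. The helper names `atoms_up_to` and `mass_and_partial_mean` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction
from itertools import product

def atoms_up_to(depth: int):
    """Yield (prefix, probability) for every atom whose first 6 occurs
    at roll n <= depth; prefix holds the n-1 earlier rolls, each in 1..5."""
    for n in range(1, depth + 1):
        for prefix in product(range(1, 6), repeat=n - 1):
            yield prefix, Fraction(1, 6**n)

def mass_and_partial_mean(depth: int):
    """Total probability of the enumerated atoms and the matching
    partial sum of n * P(atom), where X equals n on the atom."""
    mass = Fraction(0)
    mean = Fraction(0)
    for prefix, p in atoms_up_to(depth):
        n = len(prefix) + 1  # X takes the value n on this atom
        mass += p
        mean += n * p
    return mass, mean

mass, mean = mass_and_partial_mean(8)
print(mass == 1 - Fraction(5, 6)**8)  # mass missing so far is (5/6)^8
print(float(mean))                    # partial sum, approaching 6 from below
```

The mass not yet accounted for at depth $N$ is exactly $(\frac56)^N$, which tends to $0$; that is the sense in which the sequences containing no $6$ carry no probability.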

Is this way better? Not really. You still need to check $X$ is measurable (it's still pretty trivial, but in my opinion the notation makes it a little less transparent). Even with our careful choice of $\sigma$-field, there's still a hell of a lot of superfluous information. A better choice may have been $\Omega$ the set of sequences in $\{6,N\}$ (where $N$ stands for 'Not $6$') and do a similar set-up, only now our atoms are $$A_n=\{(x_1,x_2,\ldots,x_{n-1},6,y_{n+1},\ldots):x_k=N\text{ for all }k<n,\,y_k\in\{6,N\}\text{ for all }k>n\}$$ and we have $\mathbb P(A_n)=(\frac56)^{n-1}\frac16$. This is probably a little better, but if your problem is the lack of applicability of this sample space to other problems, then you're still going to be unsatisfied.

This is why, in many cases, probabilists prefer to deal with random variables rather than explicit spaces. We have a probability space $(\Omega,\mathcal F,\mathbb P)$ and a bunch of random variables defined on it with all of the dependence or independence we like. What exactly is $\Omega$? Who cares! Think of it as the unit interval with the Lebesgue measure if you like, or don't think of it at all. I would say that my initial mathematical statement of your problem is simple, quick and easy to write, and intuitive to understand, and yet it contains just as much mathematical information as the more laborious sample space stuff later.
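If you do want to picture $\Omega$ as the unit interval, one standard way (sketched here, not the only construction) is to read the die rolls off the base-$6$ expansion of $\omega\in[0,1)$: under Lebesgue measure the digits are iid uniform on $\{0,\ldots,5\}$, and shifting by one gives faces $1$ to $6$. The function names `rolls_from_omega` and `X`, and the cutoff `max_rolls`, are assumptions made for the illustration.

```python
from fractions import Fraction

def rolls_from_omega(omega: Fraction, count: int) -> list[int]:
    """First `count` base-6 digits of omega in [0, 1), shifted to faces 1..6."""
    assert 0 <= omega < 1
    faces = []
    for _ in range(count):
        omega *= 6
        digit = int(omega)   # integer part is the next base-6 digit, 0..5
        faces.append(digit + 1)
        omega -= digit       # keep only the fractional part
    return faces

def X(omega: Fraction, max_rolls: int = 200):
    """Index of the first 6 among the first `max_rolls` rolls, or None if
    no 6 shows up within that (arbitrary) cutoff."""
    for n, face in enumerate(rolls_from_omega(omega, max_rolls), start=1):
        if face == 6:
            return n
    return None

print(rolls_from_omega(Fraction(1, 7), 4))  # 1/7 is 0.0505... in base 6
print(X(Fraction(5, 36)), X(Fraction(0)))
```

The point $\omega=0$ corresponds to rolling $1$ forever, so `X` finds no $6$ there; such points form a Lebesgue-null set, matching the probability-$0$ event in the abstract formulation.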

As an added advantage, you could even in a pinch use the same space for multiple problems. You do have to be a little careful here, since it's basically impossible to definitively state what the correlations between the random variables should be (example: the original dice roll, plus the modified one you mentioned. What is the probability that the third roll in the first problem is a $4$ and the fifth roll in the second problem is a $134$? Wait, what?). However, since we suppress our dependence on the probability space, it really doesn't matter if you change probability space when moving to a new problem, even without mentioning it. No one will ever know!