What happens if we consider an algebra instead of a $\sigma$-algebra in probability theory?


I understand the difference between an algebra (of sets) and a $\sigma$-algebra. But what are the implications of using an algebra instead of a $\sigma$-algebra in probability theory? If there is one, could you please provide a small example?


There are 2 best solutions below


Essentially, as I understand it, the question is what happens if you define a probability measure as is done in textbooks such as the classic Introduction to Probability Theory by Hoel, Port, and Stone, but modify the part of the definition that relates to the sum of probabilities of disjoint sets as follows:

If $A_k$ for $k=1, 2, 3,\ldots, n$ are mutually disjoint sets in $\mathscr A$ (the domain of the probability measure), then $$P\left(\bigcup_{k=1}^n A_k\right) = \sum_{k=1}^n P(A_k).$$

The standard definition would not terminate the sequence of $A_k$, and it would replace the symbol $n$ with $\infty$ on both sides of the equation.

So what do we lose by this modification? If you do all your probability on probability spaces with finitely many events, then I think there is no ill consequence of giving up the ability to add the probabilities of infinitely many mutually disjoint sets, because such a domain contains no infinite collection of mutually disjoint non-empty sets.
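As a small illustration of the finite case, here is a minimal Python sketch, assuming a sample space of two fair coin tosses (the space and the helper names `power_set` and `prob` are illustrative choices, not taken from any textbook). It checks by brute force that every union of events is again an event and that finite additivity holds, which is all that countable additivity can ever ask for when there are only finitely many events.

```python
from fractions import Fraction
from itertools import combinations

# Illustrative sample space: outcomes of two fair coin tosses.
omega = frozenset({"HH", "HT", "TH", "TT"})


def power_set(s):
    """Return all subsets of s as frozensets (the full algebra of events)."""
    items = sorted(s)
    return [
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    ]


def prob(event):
    """Uniform probability: each outcome has weight 1/|omega|."""
    return Fraction(len(event), len(omega))


events = power_set(omega)
event_set = set(events)

for a in events:
    for b in events:
        # Closure: the union of two events is again an event.
        assert (a | b) in event_set
        # Finite additivity for disjoint events.
        if not (a & b):
            assert prob(a | b) == prob(a) + prob(b)

print(len(events))  # number of events in this finite algebra
```

Here the algebra and the $\sigma$-algebra coincide, so nothing is lost by restricting the additivity rule to finite unions.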

You would have to substantially rewrite a lot of probability theory, however. For example, in the aforementioned textbook by Hoel, Port, and Stone, Theorem 1 already invokes the part of (their) definition of probability measure that defines the probability of a union of a countably infinite collection of sets. They don't even wait until the second theorem!

Some consequences for doing mathematics with continuous distributions are implied by this response to a closely related question.

Even for discrete distributions, I think you would get into trouble fairly quickly when you encounter domains that allow for infinitely many disjoint events of non-zero probability. For example, suppose we toss a fair coin until it comes up heads, and let $X$ be the number of tosses up to and including the first heads. If $A_k$ is the event that $X = k$, the obvious probability assignment is $P(A_k) = 2^{-k}.$ But it does not follow that the total probability of the (infinite) sequence of mutually disjoint events $A_1, A_2, \ldots$ is $1$, because we do not have a definition that allows that total probability to be stated.
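To make the gap concrete, here is a minimal Python sketch (the function names are illustrative). Finite additivity does give $P(X \le n) = \sum_{k=1}^n 2^{-k} = 1 - 2^{-n}$ for every finite $n$, but that value is strictly less than $1$ each time, and no finite union of the $A_k$ ever accounts for all of the probability.

```python
from fractions import Fraction


def p_first_head_at(k):
    """P(A_k) = 2^{-k}: first head appears on toss k of a fair coin."""
    return Fraction(1, 2 ** k)


def p_first_head_within(n):
    """P(A_1 U ... U A_n), computed using finite additivity only."""
    return sum((p_first_head_at(k) for k in range(1, n + 1)), Fraction(0))


for n in (1, 2, 5, 10):
    total = p_first_head_within(n)
    # The shortfall from 1 is exactly 2^{-n}; it shrinks but never vanishes.
    print(n, total, 1 - total)
```

Passing from these finite sums to the statement that the whole sequence has total probability $1$ is exactly the step that countable additivity licenses.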

Possibly we might work around that problem by using a discrete probability distribution function to define this probability space; that is, we somehow manage to assign values to $P(X \geq k)$ and derive $P(X=k)$ from them (although it is not so clear how we would justify the values we assign). Even so, how do we compute $P(X \text{ is odd})$? This is simple with a $\sigma$-algebra:

$$P\left(\bigcup_{k=1}^\infty A_{2k-1}\right) = \sum_{k=1}^\infty P(A_{2k-1}) = \sum_{k=1}^\infty 2^{-(2k-1)} = \frac 23.$$

Without the $\sigma$-algebra, this is much more difficult (if it is possible at all). We simply have no justification to set the left side of the first equation equal to the right side.
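The following Python sketch (with an illustrative function name) shows what finite additivity alone can deliver here: the probability that $X$ takes one of the first $m$ odd values. These partial sums equal $\frac23\left(1 - 4^{-m}\right)$, so they approach $\frac23$ but never reach it; identifying the limit with $P(X \text{ is odd})$ is the step that requires countable additivity.

```python
from fractions import Fraction


def p_odd_partial(m):
    """P(X in {1, 3, ..., 2m-1}) via finite additivity, with P(X = k) = 2^{-k}."""
    return sum(
        (Fraction(1, 2 ** (2 * k - 1)) for k in range(1, m + 1)),
        Fraction(0),
    )


for m in (1, 2, 5, 10):
    partial = p_odd_partial(m)
    # The gap to 2/3 is exactly (2/3) * 4^{-m}.
    print(m, partial, Fraction(2, 3) - partial)
```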

0
On

A probability space, as you might have noted, is simply a sample space together with an associated $\sigma$-algebra of events and a probability measure.

Let's start with a probability measure: it is simply a measure normalized so that the measure of the whole sample space is unity. So a probability measure is just a measure.

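Now, what happens if we relax the requirement of a $\sigma$-algebra to that of an algebra? Well, we no longer really have a measure. If the set function is still countably additive whenever a countable disjoint union happens to lie in the algebra, we have a premeasure; if it is only finitely additive, we have something weaker still.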

Now, a premeasure can induce an outer measure. Alternatively, we can take a measure on the $\sigma$-algebra generated by the algebra and restrict it so that it coincides with the premeasure.

So in essence your question becomes, "what does countable additivity give us in probability theory that finite additivity does not?" To answer that, we might simply explore the same problems that are encountered when trying to develop a notion of measure on $\mathbb{R}^n$ with only finite additivity; the Banach-Tarski paradox is one example. Moreover, we would have to establish new mathematics to deal with "difficult" probability distributions such as the Cantor distribution.

Countable additivity wipes away a lot of problems, and a lot of those problems are particularly noteworthy when attempting to develop a consistent theory of probability.