Why are the probability/Kolmogorov axioms called "axioms"?


Question

I don't understand why the three axioms of probability (Kolmogorov axioms) are treated as axioms, not as definitions. Perhaps it hints at a larger problem: I don't understand why any axiom is treated as an axiom instead of a definition in any mathematical system.

Isn't one reason the three axioms are set up this way that most mathematicians agree that letting probabilities fall between $0$ and $1$ is logical and easier to work with than, for example, letting them fall between $-100$ and $0$? Is it wrong to say that we are therefore simply letting (i.e., defining) the lowest probability be $0$ with the first axiom, $P(E) \ge 0$? That is, $P(E) \ge 0$ is not something that we assume to be true, but is just what we are working with?
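
For reference, here is the standard statement of the three axioms I am asking about, for a probability function $P$ on a sample space $S$. The symbols $E$ and $E_1, E_2, \dots$ for events are notation assumed here for illustration:

$$
\begin{aligned}
&\text{1. } P(E) \ge 0 \quad \text{for every event } E, \\
&\text{2. } P(S) = 1, \\
&\text{3. } P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i) \quad \text{for pairwise disjoint events } E_1, E_2, \dots
\end{aligned}
$$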

Aside from seeking an answer, I have added below my own attempt at an explanation, which I would like to have checked. I appreciate your time in reading my prose!


My Thoughts (Please Check)

As we repeat an experiment many, many times, we observe empirically that some outcomes consistently occur more frequently than others. Additionally, we may observe that the relative frequency of each outcome is close to the same constant each time we repeat the experiment many times. This leads us to believe that there is some fixed underlying degree of likelihood for every outcome (that perhaps God assigned) and, more generally, for every event. Since numbers describe so much of the world well, it is reasonable to think that this degree of likelihood is a number (or at least maps to a number). In summary, we believe there is a number that describes the degree of likelihood of each event. We call this number probability (definition).
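
The observation that relative frequencies settle near a constant can be illustrated with a small simulation. This is only a sketch under stated assumptions: the experiment is taken to be a fair six-sided die, the die is simulated with Python's pseudo-random generator rather than rolled physically, and the function name `relative_frequencies` is made up for this example.

```python
import random


def relative_frequencies(num_rolls, seed):
    """Simulate num_rolls rolls of a fair six-sided die (an assumed
    experiment) and return the relative frequency of each face."""
    rng = random.Random(seed)  # seeded so each run is reproducible
    counts = {face: 0 for face in range(1, 7)}
    for _ in range(num_rolls):
        counts[rng.randint(1, 6)] += 1  # randint includes both endpoints
    return {face: count / num_rolls for face, count in counts.items()}


if __name__ == "__main__":
    # Repeat the whole "many rolls" experiment a few times with
    # different seeds and compare the frequencies across repetitions.
    for seed in (1, 2, 3):
        freqs = relative_frequencies(100_000, seed)
        print(seed, {face: round(f, 3) for face, f in freqs.items()})
```

In each repetition the printed frequencies should all land near $1/6 \approx 0.167$. Note also that relative frequencies are never negative and always sum to $1$ over all outcomes, whatever the experiment.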

Now, we wish to find this number that is probability. But where do we start? Perhaps we can start by finding the range of numbers that probability must lie in. But how do we do this? Intuitively, likelihood falls into three categories: "will never happen", "may happen", and "will always happen". The first and last categories naturally form the boundaries of likelihood. Thus, it may be easier to analyze one of the boundaries first, namely, the "will always happen" category.

We are interested in the question "What is the number/probability assigned to an event that will happen with 100% certainty?" The thing is, we don't know, and can't know. But if we don't know, we can't make any progress. Well, why don't we just assume that the probability of a sure event is $1$? Many people answer "sure!" (for various reasons we shall omit), and that is why we have the second axiom, $P(S)=1$, where $S$ is the sample space.

Here is the key point: the reason that $P(S)=1$ is an axiom (an assumption) and not a definition is the prior assumption that there is a certain number assigned to a sure event, that there is a universal truth to $P(S)$ (that perhaps God assigned). Since we can't really ascertain what the number actually is, we are left to make a reasonable assumption, which in this case is $P(S)=1$.