Raw probability does not seem to be a consistent measure of how strange an outcome is.
Suppose we have a discrete distribution corresponding to three possible outcomes:
A: 0.1, B: 0.6, C: 0.3
Then the occurrence of A is most strange, since it has the lowest probability.
Now consider another discrete distribution over 1000 possible outcomes:
A: 0.1, B: 0.00001, C: 0.0000001, D: ...
In this distribution, A still has probability 0.1, but it will be the most likely outcome if all other outcomes are associated with smaller probabilities.
Currently I need a consistent measurement of how strange an outcome is. My simple solution is the following formula,
$\text{strangeness}(A)=1-\frac{p(A)}{\max_{X \text{ is a possible outcome}} p(X)}$
As a result, the outcome with the highest probability in the distribution has zero strangeness, and strangeness always lies between $0$ and $1$. In the first distribution, the strangeness values of the three outcomes are:
A:$\frac{5}{6}$, B:$0$, C:$\frac{1}{2}$
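As a quick check of these values, here is a minimal sketch of the heuristic in Python. The function name `strangeness` and the dictionary layout are only illustrative choices, and exact fractions are used so the results can be compared directly with the numbers above.

```python
from fractions import Fraction

def strangeness(probs):
    """Heuristic score: 1 - p(x) / max_X p(X) for every outcome x."""
    p_max = max(probs.values())
    return {outcome: 1 - p / p_max for outcome, p in probs.items()}

# First distribution from the question, written as exact fractions.
dist = {"A": Fraction(1, 10), "B": Fraction(6, 10), "C": Fraction(3, 10)}
scores = strangeness(dist)

for outcome, score in scores.items():
    print(outcome, score)
```

With this input the scores come out as 5/6 for A, 0 for B, and 1/2 for C.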
However, this definition feels quite heuristic to me. Is there a better mathematical concept for modeling this kind of "strangeness"? Thank you!
In high energy physics, we need to determine whether a result is "significant," meaning that it is unexpected given some underlying model of what the probabilities of various events should be. (For example, we ask the question "is the data from this experiment comfortably consistent with the standard model, or have we encountered something unexplained by that model?")
Say we decide that a $0.001$ probability of finding this data under the standard model is strange enough to get excited about.
In assessing significance, we have to be very careful to consider the "look elsewhere effect". For example, suppose the circumstance to possibly get excited about is a "bump" at some value of mass among a spectrum of masses measured in many particle collisions, and Poisson statistics tells you that the chance of such a bump in that mass bin is one in five thousand. That will still not be a significant event, because we would have been just as excited if that same bump had happened at any of a hundred other mass bins. The probability of seeing such a bump somewhere therefore becomes roughly one in fifty, which is not significant enough to report.
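The arithmetic behind the "one in fifty" figure can be sketched as follows. This assumes, purely for illustration, exactly 100 bins that fluctuate independently with the same local probability; the function name `global_probability` is hypothetical.

```python
def global_probability(p_local, n_bins):
    """Chance that at least one of n_bins independent bins shows the bump."""
    return 1 - (1 - p_local) ** n_bins

p_local = 1 / 5000   # chance of the bump in one given mass bin
n_bins = 100         # assumed number of bins we would have been excited about

exact = global_probability(p_local, n_bins)
approx = p_local * n_bins  # simple upper bound: add up the per-bin chances

print(f"exact: {exact:.5f}, approx: {approx:.5f}")
```

Under these assumptions the exact value is slightly below the simple product $100 \times \frac{1}{5000} = 0.02$, which is why "one in fifty" is a fair round figure.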
In your simple cases, I would say the analogous approach is to define the non-strangeness of an event as the probability of the union of that event and all other events that are equally or less likely. So if your model works with positive integers and says that $$ \forall n \in \Bbb{Z}^+ : P(X = n) = 9\cdot (0.1)^n $$ then a result of $n=4$, which has a probability of $0.0009$, has a non-strangeness of $0.001$, which may well be considered a strange event. But if $$ \forall n \in \Bbb{Z}^+ : P(X = n) = \frac{1}{99}\cdot (0.99)^n $$
then a result of $n=300$, which has a probability of about $0.0005$ (much lower than in our previous example), has a non-strangeness, by this measure, of about $0.0495$. If our strangeness threshold is $0.001$, that is not a particularly strange or remarkable event.
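Both numbers can be reproduced numerically. The sketch below is one possible way to do it: the name `non_strangeness` and the truncation point `n_max` are my own choices, and the infinite sum is cut off at a point where the remaining terms are negligible for these two models.

```python
def non_strangeness(pmf, observed, n_max=5000):
    """Sum P(X = n) over n = 1..n_max with P(X = n) <= P(X = observed)."""
    p_obs = pmf(observed)
    total = 0.0
    for n in range(1, n_max + 1):
        p = pmf(n)
        if p <= p_obs:  # equally likely or less likely than the observation
            total += p
    return total

def steep(n):
    """First model: P(X = n) = 9 * 0.1^n."""
    return 9 * 0.1 ** n

def shallow(n):
    """Second model: P(X = n) = (1/99) * 0.99^n."""
    return 0.99 ** n / 99

ns_steep = non_strangeness(steep, 4)
ns_shallow = non_strangeness(shallow, 300)

print(f"n=4:   p={steep(4):.6f}, non-strangeness={ns_steep:.6f}")
print(f"n=300: p={shallow(300):.6f}, non-strangeness={ns_shallow:.6f}")
```

The second outcome is individually less probable than the first, yet its non-strangeness is roughly fifty times larger, because so much of the distribution's mass sits on outcomes that are even less likely.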