Is there a proper mathematical concept that measures the strangeness of an event?


Probability by itself does not seem to be a consistent measure of the strangeness of an outcome.

Suppose we have a discrete distribution corresponding to three possible outcomes:

A: 0.1, B: 0.6, C: 0.3

Then the occurrence of A is most strange, since it has the lowest probability.

Check another discrete distribution over 1000 possible outcomes:

A: 0.1, B: 0.00001, C: 0.0000001, D: ...

In this distribution, A still has probability 0.1, yet it is now the most likely outcome, since every other outcome has a smaller probability.

Currently I need a consistent measurement of how strange an outcome is. My simple solution is the following formula:

$\text{strangeness}(A)=1-\frac{p(A)}{\max_{X \text{ is a possible outcome}} {p(X)}}$

As a result, the outcome with the highest probability in the distribution has zero strangeness, and the strangeness always ranges from $0$ to $1$. In the first distribution, the strangeness values of the three outcomes are:

A:$\frac{5}{6}$, B:$0$, C:$\frac{1}{2}$
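The formula is easy to check numerically; here is a minimal sketch in Python (the function name `strangeness` is just for illustration):

```python
def strangeness(dist, outcome):
    """1 - p(outcome) / max p(X), per the formula above."""
    return 1 - dist[outcome] / max(dist.values())

dist = {"A": 0.1, "B": 0.6, "C": 0.3}
print(strangeness(dist, "A"))  # 5/6 ≈ 0.8333
print(strangeness(dist, "B"))  # 0.0
print(strangeness(dist, "C"))  # 0.5
```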

However, I feel this definition is quite heuristic. I am asking if there is a better mathematical concept to model such "strangeness". Thank you!


BEST ANSWER

In high energy physics, we need to determine whether a result is "significant," meaning that it is unexpected given some underlying model of what the probabilities of various events should be. (For example, we ask: "Is the data from this experiment comfortably consistent with the standard model, or have we encountered something unexplained by that model?")

Say we decide that a $0.001$ probability of finding this data under the standard model is strange enough to get excited about.

In assessing significance, we have to be very careful to account for the "look elsewhere effect". For example, suppose the circumstance to possibly get excited about is a "bump" at some value of mass in a spectrum of masses measured over many particle collisions, and Poisson statistics tells you that the chance of such a bump in that mass bin is one in five thousand. That is still not a significant event, because we would have been just as excited had that same bump appeared in any of a hundred other mass bins -- the probability becomes roughly one in fifty, which is not significant enough to report.
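The look-elsewhere correction in that example can be checked numerically; a sketch, assuming the hundred bins are independent:

```python
p_bin = 1 / 5000   # chance of a bump this large in one given mass bin
n_bins = 100       # bins where we would have been equally excited

# Probability of seeing at least one such bump anywhere in the spectrum
p_anywhere = 1 - (1 - p_bin) ** n_bins
print(p_anywhere)  # ≈ 0.0198, i.e. roughly one in fifty
```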

In your simple cases, the analogous approach is to define the non-strangeness of an event as the probability of the union of that event and all other equally or more unlikely events. So if your model works with positive integers and says that $$\forall n \in \Bbb{Z}^+ : P(X = n) = 9\cdot (0.1)^n$$ then a result of $n=4$, which has probability $0.0009$, has a non-strangeness of $0.001$, which may well be considered a strange event. But if $$\forall n \in \Bbb{Z}^+ : P(X = n) = \frac{1}{99}\cdot (0.99)^n$$

then a result of $n=300$, which has a probability of about $0.0005$ (much lower than in our previous example), has a non-strangeness, by this measure, of about $0.0495$, which, if we take our strangeness threshold to be $0.001$, is not a particularly strange or remarkable event.
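Both tail sums can be reproduced numerically. A sketch (the helper name `non_strangeness` is mine, not the answer's); since both example distributions are decreasing in $n$, the "equally or more unlikely" events are exactly the upper tail, and the infinite sums are truncated where the remaining terms are negligible:

```python
def non_strangeness(p, n, n_max=10_000):
    """Probability of the event and all equally-or-less-likely events.
    For a decreasing distribution this is the upper tail sum."""
    return sum(p(k) for k in range(n, n_max))

p1 = lambda k: 9 * 0.1 ** k
p2 = lambda k: (1 / 99) * 0.99 ** k

print(non_strangeness(p1, 4))    # ≈ 0.001
print(non_strangeness(p2, 300))  # ≈ 0.0495
```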

ANSWER

Well, I'm not sure if this answers your question, since it is phrased imprecisely. However, you might like to consider typical sets (see chapter 3 in Elements of Information Theory, Cover & Thomas). Another thing which comes to mind is the Kullback-Leibler distance (see chapter 2), and in particular Sanov's theorem (chapter 11), which "gives a bound on the probability of observing an atypical sequence of samples from a given probability distribution". In essence, what (I think) you're looking for is a measure of the "surprise" in a particular (experimental?) result; this is best quantified using Information Theory tools, such as those mentioned above.
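As a pointer toward those tools, here is a small sketch of the Kullback-Leibler divergence for discrete distributions (a worked illustration, not part of the original answer), comparing the asker's first distribution against a uniform reference:

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum over x of p(x) * log2(p(x) / q(x)), in bits.
    Terms with p(x) = 0 contribute nothing by convention."""
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

p = [0.1, 0.6, 0.3]      # the asker's first distribution
q = [1/3, 1/3, 1/3]      # uniform reference distribution
print(kl_divergence(p, q))  # > 0 since p differs from q
print(kl_divergence(p, p))  # 0: a distribution is not "surprising" vs itself
```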

ANSWER

There are many concepts that express this in statistics, machine learning, and information theory.

I believe your question concerns a "maximum likelihood estimator," which looks at probability from a Bayesian point of view, or sort of backwards, asking which possible cause would have the best chance of producing a given outcome, even if the outcome itself is very unlikely.

The first that comes to mind is how many standard deviations away from the mean something is.
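As a minimal sketch of that idea (the numbers here are hypothetical, just for illustration):

```python
def z_score(x, mean, std):
    """Number of standard deviations x lies from the mean."""
    return (x - mean) / std

# An observation of 130 against a distribution with mean 100, std 15
print(z_score(130, 100, 15))  # 2.0: two standard deviations above the mean
```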

Another is perplexity, which comes up in natural language processing, and measures how much the possibilities branch each time another event happens.

Information theory deals with the information gained from something happening: the surprisal of an individual outcome, whose average over the distribution is the entropy. The intuition is that if something very strange happens, you learn a lot from witnessing it, whereas if something completely mundane happens, you learn almost nothing.
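These quantities can be computed directly on the asker's first distribution; a minimal sketch, with surprisal measured in bits:

```python
import math

dist = {"A": 0.1, "B": 0.6, "C": 0.3}

# Surprisal (self-information) of each outcome: -log2(p), in bits.
# The rarest outcome, A, carries the most information.
surprisal = {x: -math.log2(p) for x, p in dist.items()}
print(surprisal)  # A ≈ 3.32 bits, B ≈ 0.74, C ≈ 1.74

# Entropy: the expected surprisal over the whole distribution.
entropy = sum(p * surprisal[x] for x, p in dist.items())
print(entropy)  # ≈ 1.295 bits
```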