I have a set of 5 integers, e.g. {1, 23, 17, 33, 35}. Elements can take values only from [1..36], and each value occurs only once within the set.
What math can I use to tell whether the numbers within a single given set of 5 are distributed "evenly" (meaning roughly symmetric with respect to 18, like [1,2,18,35,36]) or "cluttered to one side" ([1,2,3,30,31], [7,9,17,16,36])? "Cluttered to the left" means there are more small numbers, i.e. numbers below 18 (say 3, 4, or 5 of them are below 18).
I need to analyze many such sets (assigning an "evenly"/"cluttered" value to each) and then understand which happens more often. In addition, such an indicator must show whether:
- Numbers tend to be cluttered on the left or on the right ([1,2,3,30,31], [7,9,17,16,36]).
- Numbers tend to be close to 18 ([16,15,18,19,20]).
I am thinking of variance and standard deviation, but I am not sure; maybe there are more suitable or more advanced indicators or analysis methods.
P.S. It seems standard deviation does not help, or I do not understand how to use it:
- std([1,2,18,35,36]) = 15.21315220458929 ("evenly" distributed)
- std([1,2,3,30,31]) = 13.97998569384104 ("cluttered/skewed" to the left)
- std([7,9,16,17,36]) = 10.25670512396647 ("cluttered/skewed" to the left)
- std([1,30,31,32,33]) = 12.24091499847948 ("cluttered/skewed" to the right)
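The values above are consistent with the population standard deviation (dividing by n rather than n - 1). A minimal sketch to reproduce them, assuming Python's standard `statistics` module; the helper name `population_std` is only illustrative:

```python
from statistics import pstdev

# The four example sets from the list above.
examples = [
    [1, 2, 18, 35, 36],
    [1, 2, 3, 30, 31],
    [7, 9, 16, 17, 36],
    [1, 30, 31, 32, 33],
]


def population_std(values):
    """Population standard deviation (divides by n, not n - 1)."""
    return pstdev(values)


for s in examples:
    print(s, population_std(s))
```

Standard deviation measures spread around each set's own mean, not position relative to 18, which may be why it does not separate the "evenly" case from the "cluttered" ones here.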
In addition, the nonparametric skew, (mean - median) / standard deviation, can be used: it lies within [-1..1] and is zero when the values are symmetric with respect to the "middle".
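As a rough sketch, the nonparametric skew can be computed as (mean - median) / standard deviation. The function name `nonparametric_skew` is an assumption for illustration, and the population standard deviation is used to match the figures above. Note that the "middle" here is the set's own median, not the fixed value 18.

```python
from statistics import mean, median, pstdev


def nonparametric_skew(values):
    """Nonparametric skew: (mean - median) / population standard deviation.

    The values are distinct, so the standard deviation is never zero.
    """
    return (mean(values) - median(values)) / pstdev(values)


for s in ([1, 2, 18, 35, 36], [1, 2, 3, 30, 31], [1, 30, 31, 32, 33]):
    print(s, nonparametric_skew(s))
```

With this sign convention, a positive value means the mean lies above the median (most numbers are small, with a few large ones), and a negative value means the opposite.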
An easy, fast way to check:
Order $x_1 < x_2 < x_3 < x_4 < x_5$, then consider
\begin{equation} \sum_{i=1}^5\left(|x_{i}| - |37-x_i|\right) \end{equation}
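If they are 'evenly distributed', this sum is close to 0. Since every $x_i$ lies in $[1, 36]$, each term is just $2x_i - 37$, so the sum equals $2\sum_{i=1}^5 x_i - 185$. The worst case is $\pm 155$, attained by $\{32,\dots,36\}$ and $\{1,\dots,5\}$. You can set a threshold somewhere in between.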
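A minimal sketch of this check follows. The names `balance_score` and `classify`, and the default threshold of 30, are hypothetical choices for illustration; the answer leaves the threshold open, so it would need tuning against real data.

```python
def balance_score(values, low=1, high=36):
    """Sum of |x| - |low + high - x| over the set (here low + high = 37).

    Negative scores mean the numbers sit mostly on the low side,
    positive scores mean they sit mostly on the high side.
    """
    return sum(abs(x) - abs(low + high - x) for x in values)


def classify(values, threshold=30):
    """Label a set using an assumed, arbitrary threshold on the score."""
    score = balance_score(values)
    if score <= -threshold:
        return "cluttered left"
    if score >= threshold:
        return "cluttered right"
    return "evenly"


for s in ([1, 2, 18, 35, 36], [1, 2, 3, 30, 31], [1, 30, 31, 32, 33]):
    print(s, balance_score(s), classify(s))
```

Sorting the values first is not required for the sum itself, since addition does not depend on order.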