Quantifying the evenness-of-distribution of nodes within a necklace

172 Views Asked by At

Given a necklace with n nodes that are distributed around a circle by a set of given deltas: How would you quantify how evenly the nodes are distributed.

(By "evenly" I mean that each node is maximally far from its neighbors)

For example in Ex. 1 the set of deltas {1,1,2,2,2,2,2} can produce 3 necklaces with the following deltas:

1,1,2,2,2,2,2 (top of Ex. 1)
1,2,1,2,2,2,2
1,2,2,1,2,2,2 (bottom of Ex. 1)

How would you express that 1,1,2,2,2,2,2 is less evenly distributed than in 1,2,2,1,2,2,2 where the distances between the nodes is maximal?

What about a similar situation in Ex. 2?

EDIT:

A solution has been proposed below, which works for the majority of cases. But not for when there's an even number of all sizes.

The algorithm below, returns the same value for the following sets, even though they clearly have different levels of evenness of distribution:

[1 2 1 2 1 2 1 2] (as expected)

But also for: [1 1 2 2 1 1 2 2] [1 1 2 2 1 2 1 2] [1 1 2 1 2 2 1 2] [1 1 2 1 2 1 2 2] [1 1 2 1 1 2 2 2] [1 1 1 2 2 2 1 2] [1 1 1 2 2 1 2 2] [1 1 1 2 1 2 2 2]

enter image description here

3

There are 3 best solutions below

16
On BEST ANSWER

My proposal is to measure the evenness of the distribution by comparing the actual positions of the nodes to the positions of equally many nodes that are perfectly evenly spaced.

For example, suppose that we have "deltas" of $1, 1, 2, 2$. Then the positions of the four nodes, normalized so that the first node has position $0$ and the length of the circle is $1$, are at $0, \frac16, \frac36, \frac56$. Four evenly spaced nodes would be at positions $0, \frac14, \frac12, \frac34$. So we have a sequence of four differences: $$ \left(0, \frac16, \frac36, \frac56\right) - \left(0, \frac14, \frac12, \frac34\right) = \left(0, -\frac1{12}, 0, \frac1{12}\right). $$ In general, the signed average of these does not actually depend on the order of the "deltas". To see how big they tend to be, we could sum their absolute values, or (my choice) take the variance. As a bonus, it seems like the variance does not depend on the starting point (it gives the same result when we rotate the sequence of deltas).

The lower the variance, the more evenly the nodes are spaced.

For example, here are the permutations of [1,1,1,1,2,2,2,2] sorted by this parameter:

enter image description here

7
On

Let $A_i$ be the set that locates $i$ in the set of nodes $S$.

$$ S = 1, 1, 2, 2, 3, 3 \Rightarrow A_1 = \{1, 2\}, A_2=\{3,4\}, A_3=\{5,6\}$$ $$ S = 1, 2, 3, 1, 2, 3 \Rightarrow A_1 = \{1, 4\}, A_2=\{2,5\}, A_3=\{3,6\}$$

Let $U_i$ be the sum of the distances between the elements of $A_i$ so that it would measure how evenly distributed the $A_i$'s are. ($n=|S|$)

$$ U_i = \sum_{a,b \in A_i} \min(|a-b|, n-|a-b|) $$

Let $U$ be the sum of the $U_i$'s so that it would represent how evenly distributed $S$ is. $$ U = \sum U_i $$

Greater value of $U$ implies that $S$ is more evenly distributed.

1
On

What you are talking about here is the Kolmogorov complexity of the string of deltas.

The complexity of a string is the length of the shortest possible description of the string in some fixed universal language.

What you call "more evenly" means higher Kolmogorov complexity. What you call "less evenly" means lower Kolmogorov complexity.

This can be extended from binary strings to strings involving any finite number of distinct deltas.

Note added after comments discussion. Following the comments and correction, you might need to consider the sequence of transitions of distinct deltas. Then what you call "more evenly " are the sequences that have corresponding sequences of transitions of distinct deltas of higher Kolmogorov complexity.

In your first example, wr write 0 for a 1 -->2 transition, and 1 for a 2-->1 transition. Then the transition sequence of the first string is 10. The transition sequence of the second string is 010.

In your second example, we write 0 for a 1-->2 transition, 1 for a 2-->1 transition, 2 for a 2 -->3 transition , 3 for a 3-->2 transition, 4 for a 1-->3 transition, and 5 for a 3-->1 transition. Then in your second example, your first string has transition sequence 02, and your second string has transition sequence 02502.

In both cases, what you call "more evenly " sequences are the sequences with transition sequences of higher Kolmogorov complexity.