A mathematical measure of variety or inequality

Suppose I have X different objects and each of them contains a certain number of items.

Is there a mathematical measure that shows how equally those items are distributed among those objects?

For example, a measure that would equal 1 when the distribution is perfectly equal (e.g. 5 items in each object) and 0 when the distribution is completely unequal (e.g. one object contains all the items). It should also show a degree of inequality in between: if, say, 2 objects held most of the items (e.g. 10 and 11 items in the first two objects and only 3-5 items in each of the other 3 objects), the measure should indicate a higher degree of inequality.

Thank you!

There are 1 best solutions below

On BEST ANSWER

Entropy! A common way of measuring something like this would be to write this in terms of ratios and probabilities. I'll use your examples:

  1. Let's say we have $n$ buckets, with $5$ items in each. There are $5n$ items in total, so for each bucket $i$ we can define the probability $p_i$ that a randomly chosen item lies in bucket $i$. Since every bucket holds the same number of items, a given item is equally likely to be in any bucket, so $p_i = 1/n$ for each $i$. Then entropy is given as $$H = -\sum_{i=1}^n p_i \log p_i$$ which in this case is calculated to be $H = \log n$. In fact, we can prove that $H \leq \log n$ for any probabilities $p_i$, so this is the most equal distribution.

  2. Conversely, suppose there are $n$ buckets, and all $10$ items are in the first bucket. Then $p_1 = 1$, and $p_i = 0$ for every $2 \leq i \leq n$. (By convention, take $0 \log 0 = 0$.) Then $H = 0$ and we have a very unequal situation. We can also show that $H \geq 0$ in the general case, so this is the most unequal distribution.

  3. Finally, for a middling example, let's say we have $n=5$ buckets, and the items are distributed $10,11,4,3,5$. Then we have a total of $33$ items, and so the probabilities are $\frac{10}{33},\frac{11}{33},\frac{4}{33}, \frac{3}{33},\frac{5}{33}$. Using the formula above with the natural logarithm, we get $H = 1.487...$, which lies between the maximum of $\log 5 = 1.609...$ and the minimum of $0$.
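
The three cases above can be checked numerically. What follows is only a minimal sketch in Python, assuming natural logarithms; the function names `entropy` and `evenness` are made up for illustration. The rescaling $H / \log n$ is an extra step not used in the examples above: since $0 \leq H \leq \log n$, dividing by $\log n$ should give a value that is $1$ for a perfectly equal distribution and $0$ when one bucket holds everything, which seems to be the scale the question asks for.

```python
import math

def entropy(counts):
    """Shannon entropy (natural log) of the shares p_i = count_i / total."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one item")
    # Skip empty buckets: by convention 0 * log(0) = 0.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def evenness(counts):
    """Entropy divided by its maximum log(n); an assumed rescaling to [0, 1]."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two buckets")
    return entropy(counts) / math.log(n)

print(round(entropy([10, 11, 4, 3, 5]), 3))   # 1.488
print(round(evenness([10, 11, 4, 3, 5]), 3))  # 0.924
```

With this rescaling the middling example scores about $0.92$, so entropy treats it as fairly close to equal; whether that matches your intuition for "unequal" is a judgment call.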