How to know the result of the entropy function for a uniformly distributed set


In the entropy function

$H(S) = -\sum_i P(\text{class}=i \mid S)\log_2 P(\text{class}=i \mid S)$

I am trying to understand the range of its output for any input. I know that given a set where one unique item has a frequency of 100% and every other unique item has a frequency of 0%, $H(S)=0$.

But what if every unique item in the given set is equally frequent? How can you know the result of $H(S)$ right away, without computing it manually? Is there another formula or relation that gives this?

I wrote a quick Python script to test different fully uniform sets and see what the output is (see below), but I couldn't find any relationship between the output and the input variables.

Does anyone know about this?

Thanks

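import math

# c is the number of equally frequent classes
for c in range(1, 101):
    a = 10.0      # count of each class
    b = a * c     # total number of items in the set

    s = 0
    for i in range(c):
        # every class has probability a/b = 1/c
        s += (a/b) * math.log(a/b, 2)
    s = -s

    print(s)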

Output:

-0.0
1.0
1.58496250072
2.0
2.32192809489
2.58496250072
2.80735492206
3.0
3.16992500144
3.32192809489
3.45943161864
3.58496250072
3.70043971814
3.80735492206
3.90689059561
4.0
4.08746284125
4.16992500144
4.24792751344
4.32192809489
4.39231742278
4.45943161864
4.52356195606
4.58496250072
4.64385618977
4.70043971814
4.75488750216
4.80735492206
4.85798099513
4.90689059561
4.95419631039
5.0
5.04439411936
5.08746284125
5.12928301694
5.16992500144
5.20945336563
5.24792751344
5.28540221886
5.32192809489
5.35755200462
5.39231742278
5.4262647547
5.45943161864
5.49185309633
5.52356195606
5.55458885168
5.58496250072
5.61470984412
5.64385618977
5.67242534197
5.70043971814
5.72792045456
5.75488750216
5.78135971352
5.80735492206
5.83289001416
5.85798099513
5.88264304936
5.90689059561
5.93073733756
5.95419631039
5.9772799235
6.0
6.02236781303
6.04439411936
6.06608919046
6.08746284125
6.10852445678
6.12928301694
6.1497471195
6.16992500144
6.18982455888
6.20945336563
6.2288186905
6.24792751344
6.26678654069
6.28540221886
6.30378074818
6.32192809489
6.33985000288
6.35755200462
6.37503943135
6.39231742278
6.40939093614
6.4262647547
6.44294349585
6.45943161864
6.47573343097
6.49185309633
6.5077946402
6.52356195606
6.53915881111
6.55458885168
6.56985560833
6.58496250072
6.59991284219
6.61470984412
6.62935662008
6.64385618977

There are 1 best solutions below


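If there are $n$ unique items in the set $S$, each with equal (conditional) probability of occurrence $\tfrac 1 n$, then the sum consists of $n$ identical terms: $$\begin{align} \mathsf H(S) & = - \sum_{i=1}^n \tfrac 1 n \log_2(\tfrac 1 n) \\[2ex] & = - \log_2(\tfrac 1 n) \\[2ex] & = \log_2 n \end{align}$$ In your script $n$ is the loop variable c, so the printed values are $\log_2 1 = 0$, $\log_2 2 = 1$, $\log_2 3 \approx 1.585$, and so on.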
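As a quick numerical check of the $\log_2 n$ relation, here is a minimal Python 3 sketch. The helper name `entropy` and the sample values of `n` are illustrative choices, not taken from the question's script; it assumes the probabilities passed in already sum to 1.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; terms with probability 0 are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Compare the entropy of a uniform distribution over n items with log2(n).
for n in (1, 2, 3, 8, 100):
    uniform = [1.0 / n] * n
    h = entropy(uniform)
    print(n, h, math.log2(n), math.isclose(h, math.log2(n), abs_tol=1e-9))
```

The last column should be `True` for every `n`, up to floating-point rounding, which matches the pattern in the output listed in the question.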