Why is entropy defined as it is?


I'm currently reading a machine learning textbook (by Tom Mitchell), and the ID3 algorithm in it uses a quantity called entropy.
The entropy of a collection $S$ whose classification is boolean is defined as $$Entropy(S) \equiv -p_\oplus \log_2 p_\oplus - p_\ominus \log_2 p_\ominus$$ where $p_\oplus$ and $p_\ominus$ are the proportions of positive and negative examples in $S$. Why is it defined this way? I understand that it has the required properties: the entropy is zero when $p_\oplus = 0$ or $1$, and it equals $1$ when $p_\oplus = 0.5$. But other functions have these properties too, for example $$Entropy(S) = -4(p_\oplus - 0.5)^2 + 1$$ Why do we use logarithmic functions to calculate entropy? Is there a reason to prefer them?
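To make the comparison concrete, here is a minimal Python sketch (the function names `entropy` and `quadratic` are just illustrative labels) that evaluates both candidate measures on a few values of $p_\oplus$. Both hit $0$ at the endpoints and peak at $1$ when $p_\oplus = 0.5$, but they clearly differ in between:

```python
import math

def entropy(p):
    """Binary Shannon entropy: -p*log2(p) - (1-p)*log2(1-p).

    By the standard convention 0*log2(0) = 0, the edge cases return 0.
    """
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def quadratic(p):
    """The alternative measure proposed above: -4*(p - 0.5)^2 + 1."""
    return -4 * (p - 0.5) ** 2 + 1

# Both measures are 0 at p = 0 or p = 1 and reach 1 at p = 0.5,
# yet they assign different values elsewhere:
for p in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    print(f"p={p:.2f}  entropy={entropy(p):.4f}  quadratic={quadratic(p):.4f}")
```

So the boundary conditions alone don't single out the logarithmic form, which is exactly what makes me wonder what does.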