This might not strictly be a mathematics question, but it does relate to information theory.
My question is:
Does the information-gain criterion in decision-tree machine learning favor a high-entropy attribute or a low-entropy one?
The source of my confusion is the definition of Shannon's entropy function:

H = -SUM(pi * log2(pi))

Note the leading MINUS sign. If that minus is part of the formula, then surely

gain = Hbefore - Hafter

actually means

gain = Hbefore + Hafter

??... Or have people just forgotten about the minus sign??
The minus sign is NOT a subtraction; it is part of the definition of entropy. It is there because we are taking logarithms of probabilities, which lie between 0 and 1, so their logarithms are negative (or zero).
Try it on a calculator: what is the base-2 logarithm of 0.5? That's right, it is -1. So for a random variable that is 0 with 50% probability and 1 with 50% probability, the sum of pi*log2(pi) comes out to -1, and we negate it so that the entropy is a non-negative 1 bit. The gain formula gain = Hbefore - Hafter is then a genuine subtraction of two non-negative quantities.
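To make this concrete, here is a minimal Python sketch of entropy and information gain (the split in the example is made up for illustration):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)), with 0*log2(0) treated as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin: log2(0.5) = -1 for each outcome, so the leading minus
# turns the sum of -1 into an entropy of +1 bit.
h_before = entropy([0.5, 0.5])  # 1.0

# Hypothetical attribute split: one child node is pure (entropy 0),
# the other is still a fair coin. Hafter is the weighted average of
# the children's entropies.
h_after = 0.5 * entropy([1.0]) + 0.5 * entropy([0.5, 0.5])  # 0.5

# Information gain is an ordinary subtraction of non-negative numbers.
gain = h_before - h_after  # 0.5
```

Both Hbefore and Hafter are non-negative, so the minus in gain = Hbefore - Hafter is exactly the subtraction it looks like; nobody forgot a sign.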