Information gain on two continuous classes instead of binary


I have a problem with an exercise on information gain. I can't seem to get the right answer, because the exercise differs from what we learned. Usually, the target class is a binary variable (skiing: yes or no). In this example, however, there are two classes, each with a number of instances.

So what I tried was the following: turn the instance counts into a binary class (Y > O).

To calculate the information gain, I then computed

G(M,D) = H(M) - (6/12 * H(d1) + 6/12 * H(d2))
G(M,D) = H(7/12, 5/12) - 6/12 * H(5/6, 2/6) - 6/12 * H(3/6, 4/6)

but due to the Y:2, O:2 entry, the class probabilities inside H do not add up to 1.

How should I approach this?

Dataset


BEST ANSWER

Information gain is the difference between the original entropy of the output class and the conditional entropy of the output class when you condition on the variable you are calculating the information gain for. Your conditioning variable $D$ has two values, $d_1$ and $d_2$.

If $Y$ is the number of young and $N$ is the total number of data points, then the original entropy is $H(X) = -p \log p - (1-p) \log (1-p)$ where $p = Y/N$.

Similarly, suppose you have $Y_1$ young and $N_1$ total data points when $D = d_1$, and $Y_2$ young and $N_2$ total when $D = d_2$. Then the conditional entropy is $H(X \mid D) = -q_1\big(p_1 \log p_1 + (1 - p_1) \log (1 - p_1)\big) - q_2\big(p_2 \log p_2 + (1 - p_2) \log (1 - p_2)\big)$ where $q_1 = N_1 / N$, $q_2 = N_2 / N$, $p_1 = Y_1 / N_1$ and $p_2 = Y_2 / N_2$. The information gain for variable $D$ is then $H(X) - H(X \mid D)$.
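The formulas above can be sketched in a few lines of Python. The split counts below are illustrative only (the actual dataset isn't reproduced in the question); the key point is that each group's class counts must be taken out of that group's own total, so the probabilities inside each entropy term sum to 1.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (base 2) of a class-count distribution.

    Probabilities are counts divided by the group's own total,
    so they always sum to 1. Zero counts contribute nothing.
    """
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, split_counts):
    """H(X) - H(X|D): parent entropy minus the weighted average
    of the entropies of the groups produced by the split."""
    n = sum(parent_counts)
    conditional = sum(sum(g) / n * entropy(g) for g in split_counts)
    return entropy(parent_counts) - conditional

# Hypothetical example with the question's totals: 12 points,
# 7 young (Y) and 5 old (O), split by D into
# d1 = [5 Y, 1 O] and d2 = [2 Y, 4 O].
gain = information_gain([7, 5], [[5, 1], [2, 4]])
```

Note that in the $d_1$ group the probabilities are $5/6$ and $1/6$ (out of the group's own 6 points), not $5/6$ and $2/6$: per group, the class counts must sum to the group size.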