Classification cost in growing a classification tree

This question comes from machine learning, but it is heavily mathematical, so I am posting it here.

$$\hat{\pi}_c = \frac{1}{|\mathcal{D}|} \sum_{i \in \mathcal{D}} \mathbb{I}(y_i = c)$$

This equation is used to estimate the class-conditional probabilities for a leaf of a classification tree containing data D. First, I am trying to determine what the denominator outside the summation means. My textbook gives the equation followed by "where D is data in the leaf." That notation looks like the absolute value of the data, which doesn't make sense to me.

Also, does the symbol directly to the right of the summation mean the same thing as a summation, but with multiplication instead of addition? If so, that doesn't make sense. My other guess is that it is something along the lines of the probability that $y_i$ is class $c$.

There is 1 best solution below

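The symbol $|\mathcal{D}|$ means "number of elements in $\mathcal{D}$" (the cardinality of the set), not an absolute value. The symbol $\mathbb{I}(y_i=c)$ is the indicator function, not a product: it is $1$ if the class of sample $i$ equals $c$, and $0$ otherwise. So the sum counts the samples in the leaf whose class is $c$, and dividing by $|\mathcal{D}|$ gives the fraction of the leaf's samples that belong to class $c$.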
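To make this concrete, here is a minimal Python sketch of the estimate, assuming it has the form described in the question (a sum of indicators over the leaf, divided by $|\mathcal{D}|$). The function name `class_proportions` and the toy labels are made up for illustration; they do not come from the textbook.

```python
def class_proportions(leaf_labels, classes):
    """Estimate the fraction of samples in a leaf that belong to each class.

    leaf_labels: the labels y_i of the samples that fell into the leaf (the set D).
    classes: every class c we want an estimate for.
    """
    n = len(leaf_labels)  # |D|: the number of samples in the leaf
    if n == 0:
        raise ValueError("leaf must contain at least one sample")
    # The inner sum adds 1 whenever y_i == c, which is the indicator I(y_i = c).
    return {c: sum(1 for y in leaf_labels if y == c) / n for c in classes}


# Hypothetical leaf with four samples: three of class "a", one of class "b".
leaf_labels = ["a", "b", "a", "a"]
pi_hat = class_proportions(leaf_labels, classes=["a", "b", "c"])
print(pi_hat)  # {'a': 0.75, 'b': 0.25, 'c': 0.0}
```

Here $|\mathcal{D}| = 4$, the indicator sum for class "a" is $3$, and the estimate is $3/4$. The estimates over all classes add up to $1$, as proportions should.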