Class boundaries: definition for discrete data


Why do we add and subtract 0.5 in the definition of class boundaries? What is the reason for this? One of the definitions can be found here: http://tistats.com/definitions/class-boundaries/ Thank you

Best answer

Note that the claim is not that the data is discrete, but that the class labels are discrete. At your link, the answer to your question is the phrase "while preventing gaps": every possible value of the data ends up assigned to a class. (Sometimes you don't know whether your real-world data is precisely discrete, approximately discrete, or continuous. It would be comforting to know that, regardless of which of those occurs, your process assigns a class to every possible data value.)

Say my two classes are "$1$" and "$2$" but my variable, $x$, can take any value in $[0,3]$. Suppose I make a measurement and find $x = 0.1$. Pretty clearly, this should go in the class "$1$". (I put classes in quotation marks because they are labels for ranges of values.) But what about values between $1$ and $2$? We have to put a boundary somewhere to divide values belonging to the class "$1$" from values belonging to the class "$2$". In the absence of any other reason, why not put the boundary halfway from $1$ to $2$? Then values in the interval $[0,1.5)$ are assigned to the class "$1$" and values in the interval $(1.5,3]$ are assigned to the class "$2$". Note that $1.5 = 1 + 0.5 = 2 - 0.5$, which is where the $0.5$ in the definition comes from when the class labels are one unit apart.
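As a rough illustration of the halfway rule, here is a minimal Python sketch. The function name `midpoint_boundaries` is made up for this example and does not come from the linked definition; it assumes the class labels are plain numbers.

```python
def midpoint_boundaries(labels):
    """Return the cut points halfway between consecutive class labels."""
    labels = sorted(labels)
    # Pair each label with its successor and take the midpoint of each pair.
    return [(a + b) / 2 for a, b in zip(labels, labels[1:])]


# For the classes "1" and "2" the single boundary is 1.5.
print(midpoint_boundaries([1, 2]))
# For labels one unit apart, every boundary is a label plus (or minus) 0.5.
print(midpoint_boundaries([1, 2, 3, 4]))
```

For labels spaced one unit apart, each midpoint is a label plus 0.5 (equivalently, the next label minus 0.5), which matches the add-and-subtract-0.5 rule in the question.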

What about the value $1.5$? There are two standard ways to deal with it:

  • Method 1: On the one hand, the probability of a continuous variable having the value $1.5$ exactly is zero, so we need not concern ourselves with this eventuality.
  • Method 2: ... Well, if that's true, assign it to either class, since the probability of that choice affecting the assignment of a value is zero (by the argument used in method 1).

So just pick whichever class and attach $1.5$ to its interval.
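To make "pick whichever class" concrete, here is a small Python sketch of the assignment step with an explicit tie rule. The function `assign_class` and its `tie` parameter are hypothetical names chosen for illustration; the sketch assumes `labels` is sorted and `boundaries` holds the sorted cut points between consecutive labels.

```python
from bisect import bisect_left, bisect_right


def assign_class(x, labels, boundaries, tie="upper"):
    """Assign x to a class label, given sorted cut points between the labels.

    tie="upper" attaches a boundary value to the higher class,
    tie="lower" attaches it to the lower class.
    """
    if tie == "upper":
        # bisect_right counts boundaries <= x, so x == boundary moves up a class.
        index = bisect_right(boundaries, x)
    else:
        # bisect_left counts boundaries < x, so x == boundary stays in the lower class.
        index = bisect_left(boundaries, x)
    return labels[index]


labels = [1, 2]
boundaries = [1.5]
print(assign_class(0.1, labels, boundaries))               # class 1
print(assign_class(2.9, labels, boundaries))               # class 2
print(assign_class(1.5, labels, boundaries, tie="upper"))  # class 2
print(assign_class(1.5, labels, boundaries, tie="lower"))  # class 1
```

Either tie rule leaves no gaps: every value in $[0,3]$ lands in exactly one class, and the two rules differ only at the boundary value itself.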