Classification of 32-bit integers into 2 classes with uneven probability

I've been given a set of data $x_t$, all of which are 32-bit unsigned integers. These data were previously divided into two classes by a function $f$ that is unknown to me, as follows:

$$f(x)=\begin{cases}a, & \text{classed as $a$ for $25$ percent of data}\\b, & \text{classed as $b$ for $75$ percent of data}\end{cases}$$

As you can see, the classification is uneven: more data are classified as $b$ than as $a$.

The data now available consist of $1000$ random-looking 32-bit integers $x_t$, each with its corresponding classification $c_t$.

The question is: which clustering or classification algorithm would be best suited to guessing the class of any given $x$ that is not among the $x_t$, either with certainty or at least with probability higher than 90 percent?
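To make the question concrete, here is a minimal sketch of how a candidate algorithm could be checked against held-out data. Everything in it is an assumption for illustration: `hypothetical_f` is a synthetic stand-in for the unknown function (it labels $x$ as $a$ when its two lowest bits are zero, which gives roughly 25 percent $a$), and `fit_bit_pair_rule` is just one simple candidate that treats each integer as 32 binary features. If the real $f$ behaves like a hash, no such rule should be expected to beat the majority-class baseline.

```python
import random
from collections import Counter
from itertools import combinations


def to_bits(x):
    """Return the 32 bits of x, least significant bit first."""
    return [(x >> i) & 1 for i in range(32)]


def majority_label(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_bit_pair_rule(xs, labels):
    """Pick the pair of bit positions whose 4-cell majority table
    fits the training labels best, and return a predict function."""
    default = majority_label(labels)
    bits = [to_bits(x) for x in xs]
    best = None
    for i, j in combinations(range(32), 2):
        cells = {}
        for b, c in zip(bits, labels):
            cells.setdefault((b[i], b[j]), Counter())[c] += 1
        # Training hits if each cell predicts its own majority label.
        hits = sum(cnt.most_common(1)[0][1] for cnt in cells.values())
        if best is None or hits > best[0]:
            table = {k: cnt.most_common(1)[0][0] for k, cnt in cells.items()}
            best = (hits, i, j, table)
    _, i, j, table = best

    def predict(x):
        # Fall back to the overall majority for a cell never seen in training.
        return table.get(((x >> i) & 1, (x >> j) & 1), default)

    return predict


def accuracy(predict, xs, labels):
    """Fraction of inputs whose predicted label matches the true label."""
    return sum(predict(x) == c for x, c in zip(xs, labels)) / len(xs)


def hypothetical_f(x):
    # Assumption for illustration only: NOT the real, unknown f.
    return "a" if x & 3 == 0 else "b"


random.seed(0)
data = [random.getrandbits(32) for _ in range(1000)]
labels = [hypothetical_f(x) for x in data]

# Hold out 300 of the 1000 labelled entries for testing.
train_x, test_x = data[:700], data[700:]
train_c, test_c = labels[:700], labels[700:]

rule = fit_bit_pair_rule(train_x, train_c)
baseline = majority_label(train_c)

rule_acc = accuracy(rule, test_x, test_c)
baseline_acc = accuracy(lambda x: baseline, test_x, test_c)
print(f"bit-pair rule: {rule_acc:.3f}, majority baseline: {baseline_acc:.3f}")
```

The point of the sketch is the comparison, not the particular rule: any algorithm would have to clear the baseline on held-out entries before its guesses on new $x$ could be trusted.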

Is the fact that $f(x)$ classifies inputs unevenly relevant here? What if $f(x)$ classified inputs evenly, with a probability of 50 percent for each class? Would that affect the complexity of the problem and the choice of algorithm?
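One way to see where the imbalance enters, as a rough sketch: a guesser that ignores $x$ and always answers with the more frequent class is already right with probability $\max(p, 1-p)$, where $p$ is the fraction of class $a$. The helper names below are hypothetical and introduced only for this illustration.

```python
import math


def baseline_accuracy(p):
    """Accuracy of always guessing the more frequent class,
    where p is the fraction of inputs in class a."""
    return max(p, 1.0 - p)


def label_entropy(p):
    """Shannon entropy, in bits, of a label that equals a with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


for p in (0.25, 0.5):
    print(f"p={p}: baseline={baseline_accuracy(p):.2f}, "
          f"entropy={label_entropy(p):.3f} bits")
```

So with the 25/75 split a trivial guess starts at 75 percent, while with an even split it starts at 50 percent; the 90 percent target is further from the starting point in the even case.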