Guessing the color of a point in the unit square by looking at the colors of its neighbors

62 Views Asked by At

Let's look at the unit square, and color some of its points red and the others blue.

Now we sample n points from some distribution on the square, and guess the color of a new point using the colors of the points of the sample. We do this in two ways:

First way: we guess the color of the point using the color of the nearest point to it on the sample.

Second way: we guess the color of the point using a majority vote between the 2k+1 points from the sample that are nearest to it.

Conjecture: the expected accuracy of the second method is at most $\frac 1 8$ more than the accuracy of the first method.

We are free to choose the coloring, n and k. The largest difference in the accuracy I found is the following:

We color $\frac 1 4 $ of the square red and the rest blue very uniformly, so that the red and blue mix very well, and choose k large, n=2k+1. Then the first method has mistake rate $0.25 \times 0.75+0.75 \times 0.25=37.5$, since with probability $\frac 1 4 $ a point will be red and will be wrong $0.75$ of the time, since the distance is independent of the coloring (so the closest point has random color). The second method will have a mistake rate of $0.25$, since we always predict the more prevalent color.

The constant $0.25$ was chosen so it optimizes this kind of attempt. Is there a better one?